Install programs
I installed ipython3-notebook (in Debian Jessie) from the Synaptic package manager. In order to install the R module, I also installed pip for Python 3 from Synaptic. pip is Python's package installer; it downloads modules from the Python Package Index (PyPI). Then I used pip3 to install rpy2:
sudo pip3 install rpy2
There is a blog post on how to avoid using sudo to install pip modules.
Install statsmodels, a module for statistical modelling and econometrics in Python. Maybe I should have installed python-statsmodels as a Debian package instead? But it seems to be linked to Python 2.x instead of Python 3 (it had a dependency on python2.7-dev). Therefore I installed statsmodels with pip3, using the --user flag (the sudo-free approach mentioned above) to install it as a user-only module:
pip3 install --user statsmodels
The installation took several minutes on my system. It seemed to be installing a number of dependencies. Many warnings about variables defined but not used were returned, but the installation kept running. The final message was:
Successfully installed statsmodels numpy scipy pandas patsy python-dateutil pytz
Cleaning up...
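A quick way to confirm that the installation worked is to check, from Python 3, that each module can be found. The following is only a minimal sketch using the standard library; the helper name check_modules is made up for this example, and the module list simply mirrors the pip output above plus rpy2.

```python
import importlib.util


def check_modules(names):
    """Return a dict mapping each top-level module name to True if it can be found."""
    # find_spec() locates a top-level module without importing it,
    # so a missing module gives None instead of raising ImportError.
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Import names for the packages listed in the pip output, plus rpy2.
    # Note that python-dateutil is imported as "dateutil".
    wanted = ["rpy2", "statsmodels", "numpy", "scipy",
              "pandas", "patsy", "dateutil", "pytz"]
    for name, found in check_modules(wanted).items():
        print(name, "OK" if found else "MISSING")
```

Run it with python3 so that it checks the same interpreter that pip3 installed into.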
Starting the IPython notebook
Move to a directory where the notebooks will be stored and start an IPython notebook kernel:
cd python
ipython3 notebook
Shortcuts
See also the IPython Notebook shortcuts. Useful shortcuts are ESCAPE to go into navigation mode and ENTER to enter edit mode. It seems one can use the vim navigation keys j and k to move up and down between cells. Pressing the "d" key twice deletes a cell. CTRL+ENTER runs the cell in place, SHIFT+ENTER runs the cell and jumps to the next one, and ALT+ENTER runs the cell and inserts a new cell below.
Run R commands in the IPython notebook
Load an IPython extension that deals with R commands:
%load_ext rpy2.ipython
Display a standard R dataset:
%R head(cars)
%R plot(cars)
Use data from the Python statsmodels module, based on this page.
import statsmodels.datasets as sd
data = sd.longley.load_pandas()
Print the column names of the dataset:
print(data.endog_name)
print(data.exog_name)
Print a dataset as an HTML table by simply giving its name in the cell. For example, this data frame contains the exogenous variables:
data.exog
Python can pass variables to R with the following command:
totemp = data.endog
gnp = data.exog['GNP']
%R -i totemp,gnp
Estimate a linear model with R:
%%R
fit <- lm(totemp ~ gnp) # Least-squares regression
print(fit$coefficients) # Display the coefficients of the fit.
plot(gnp, totemp) # Plot the data points.
abline(fit) # And plot the linear regression.
Plot the data points and linear regression with the ggplot2 package:
%%R
library(ggplot2)
ggplot(data = NULL, aes(x =gnp, y = totemp)) +
geom_point() +
geom_abline( aes(intercept=coef(fit)[1], slope=coef(fit)[2]))
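For a single predictor, the intercept and slope that lm() reports (the two values passed to geom_abline as coef(fit)[1] and coef(fit)[2]) follow from the closed-form least-squares formulas. As a rough illustration, here is a plain-Python sketch of that calculation; the function name simple_ols and the toy numbers are invented for this example and are not the Longley data.

```python
def simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x for one predictor."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # slope = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x)^2)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values are all identical; slope is undefined")
    slope = sxy / sxx
    # The fitted line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Toy data lying exactly on y = 1 + 2x (made up for illustration).
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
intercept, slope = simple_ols(x, y)
print(intercept, slope)
```

In the notebook, calling simple_ols(list(gnp), list(totemp)) should give the same two numbers as print(fit$coefficients), up to rounding.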