Thursday, December 26, 2013

Python-pandas importing a data frame from MySQL

I wanted to load tables from a mysql database and to run analyses on them. Had already done some analysis on R, but wanted to make them portable to a website, and thought that python would be better suited for that. The version of pandas currently shipped with Ubuntu is outdated 0.7. I had to use another method to get a newer version. Pandas source code is currently hosted on GitHub at: http://github.com/pydata/pandas 
After a
    sudo apt-get install python-pip 
 I installed pandas via ``pip``::
    pip install --upgrade pandas 
Still my script with
    dtf = pandas.io.sql.read_frame("SELECT * FROM Table", db) 
was returning an error, expecting list got tuple.
And  pandas.__version__ was still at 0.7.0.
I uninstalled the python-pandas package.:
    sudo apt-get remove python-pandas
And ran again
    sudo pip install pandas
After that, pandas.io.sql.read_frame() was working as expected.
And dtf.head() showed me a proper vue of the table.
Columns can be selected with dtf.columnname or dtf['columnname'].

Tuesday, December 17, 2013

Lyx article editing tips

The Lyx "What You See is What You Mean" editor is available at Lyx and from your package manager.

Citations

Lyx nicely integrates with a bibtex bibliography. You can create a shortcut such as CTRL+SHIFT+C to insert a reference. A small menu will pop-up to choose from the references.

Unfortunately  "BibTeX does not support files encoded in UTF-8". To get characters such as "é" encoded correctly latin1 has to be used.This can be set as the database property in Jabref.

Shortcuts

Under Tools / preferences / Editing / Shortcuts /  Stackoverflow question on how to find the command name for a shortcut. I mapped:
  • citation-insert to Ctrl+Alt+C 
  • href-insert to Ctrl+K
  • label-insert to Ctrl+Alt+L
  • statistics to Ctrl+Alt+T
See also the Lyx wiki page on shortcuts.

Merge and filter bibliographies 

Answers to a question on merging 2 bibtex file recommend using Jabref file / import into current database.

Create a shorter bibliography with only cited documents from another bibliography Extracting only the BibTex records relevant to the paper. Answers recommend using bibtool or bibexport.

I have written an article in Lyx which cites 2 bibliographies and a colleague asked to have a .tex version of this article. Here is my recipe:
  1. Use Jabref to merge two bibliographies. 
  2. Export Lyx file to LaTeX (PDFLaTeX).
  3. Edit the .tex file to insert a link to the merged bibliographies. 
    1. \phantomsection\addcontentsline{toc}{section}{\refname}\bibliography{long.bib} 
  4. Compile the .tex with TeXstudio. 
  5. Used bibexport to extract a short bibliography corresponding only to the citations used in one article.
bibexport -o short.bib article.aux
  1. Replace long.bib by short.bib in the .tex file.
I have done it 3 times already. Could I create a script that does all this automatically?

Spell checker

A spell checker can be activated under Tools / preferences / language settings / spellchecker. On my system the only spell check engine available is Enchant, "a wrapper that provides automatic access to many spell checking libraries (ispell, aspell, hspell, hunspell, etc.)".

The button displaying a red-underlined A can be activated to spell check continuously.

Language of document

Under Document / Settings / Language, it's possible to change the document language, which will have an impact on the language used for spell checking. In my system compiling a document in French returned an error:
"Package babel Error: Unknown option `french'."
This stackexchange post suggest installing the texlive-lang-french Debian package. This fixed the error.

Math mode

Press CTRL+M at the cursor or on highlighted text to enter math mode. Latex syntax "\beta", "x_t" can be used to enter equations and is displayed directly as math symbols.

Change input encoding

This question on "how do I prevent  Lyx from loading inputenc?" explains where this part of the Latex preamble is defined.
In Document > Settings..., under Language > Encoding, choose "Other" and select "Language Default (no inputenc)".

Instead I replaced Latin1 by utf8 as input encoding. This works for the moment.

Encoding issues and Greek symbols used outside math mode

"Encoding file 'lgrenc.def' not found"  was fixed by installing the debian package texlive-lang-greek (as explained here on stackoverflow). Other errors appeared related to the use of beta signs outside math mode. As one user explained on stackoverflow, placing \beta signs in math mode usually fixes the issue.

Hyperlinks in PDF

How to change hyperlink colors in Lyx
Document > Settings > PDF Properties under the Additional options field. For example, a custom set of instructions may be the following: linkcolor=black, urlcolor=blue, citecolor=green, pdfstartview={FitH}, hyperfootnotes=false, unicode=true 
I changed marineblue to blue, as my system complained that marineblue was not available.

The lyx function that creates hyperlinks is called href-insert. You can create a CTRL+K shortcut that inserts hyperlink under Tools / preferences / Editing / Shortcuts / "Cursor, Mouse and Editing functions" by inserting the function name: href-insert.

Presentation template, called beamer

The beamer theme and color theme can be changed by inserting this in Document / Seetings / Latex preamble:
\usetheme{Boadilla}
\usecolortheme{orchid}
An overview of themes is available in the beamer theme matrix.
Here is an explanation on how to change to any other RGB colour. Beware the definition of RGB colours are unusual in Latex, divide values by 255.
\definecolor{darkgreen}{rgb}{.2,.4,.2}
\usecolortheme[named=darkgreen]{structure}

Slide numbering can be added to the beamer template. By inserting this code block in Document / Settings / latex preamble:
\addtobeamertemplate{navigation symbols}{}{%
    \usebeamerfont{footline}%
    \usebeamercolor[fg]{footline}%
    \hspace{1em}%
    \insertframenumber/\inserttotalframenumber
}

Tables too wide for a slide can be resized by placing the table between the following ERT. If there is a table float, this instruction has to be placed inside the table float.


}

 Elsevier article template

Elsevier provides a latex template for articles called elsarticle which is usable in Lyx. However compiling with this template doesn't seem to work when using the hyperref package with the general option "automatically fill header". After deactivating this option, compilation works again.

The elsarticle template uses numerical references by default. it is possible to activate author year reference according to this stack exchange answer.

 Tips

  • CTRL+D uses yap, a faster (on windows) previewer than the default PDF one. There is no noticeable difference on Linux.

To produce a meeting report with a wider template than the default article.
  • I would have liked to find an overview of the templates with screenshots of PDF results. But I couldn't find such overview online. LyX wiki has a list of examples and a list of layouts
  • In fact all that was needed was to edit the Document \ Settings \ Page margins, I set the inner and out margins to 2 cm and was happy with the result.

knitr and Lyx

Export to Microsoft Word compatible format RTF

Under File / Export / Rich Text Format, it's possible to export Lyx files to a format compatible with Microsoft Word.
My Lyx file uses many packages and formatting options. Export to RTF didn't work and returned the error:
"An error occurred while running: latex2rtf"
I made a copy in a temporary directory (/tmp) and removed external packages and options in the latex preamble, before performing the conversion. Then the conversion to RTF worked.

Compare different versions of a document

Use Tools/Compare to highlight differences between a new and an old document. Then show changes in output to show changes in red and blue in the PDF output.
Alternatively, you can export the Lyx files to Latex and use Latexdiff as such:
latexdiff draft.tex revision.tex > diff.tex

Avoid complex tables or figures in the diff, as they can cause issues. I had to create a fake old documents with final figures and table for latexdiff to return a usable output.

Monday, December 16, 2013

R display an estimated equation as a latex formula with coefficients in parenthesis

In a Rnw file this function can display an estimated equation in latex form with coefficients in parenthesis.

Wednesday, December 11, 2013

MySQL commands

As root, create a database and grant permissions to a new user

To connect to the mysql client as root
mysql -u root -p
You'll be prompted for the password.

How to create a new user
CREATE USER 'newuser'@'localhost' IDENTIFIED BY 'password';
Create a database
create database databasename
Connect to a database
connect databasename
 Grant permissions to a user
GRANT ALL PRIVILEGES ON * . * TO 'newuser'@'localhost';
Then you can log out of mysql
quit

As a user 

To connect to the mysql client as a user
mysql -u username -p
If you have the correct privileges, you can also create a database with a pipe. How to create a database
echo "create database databasename" | mysql -u username -p
Load a dump
cat file.sql | mysql -u username -p databasename

Graphical client

On Ubuntu, I used mysql navigator by shi bok jang.
On Debian, I used  mysql workbench

More on users

Change password, as the user itself or as root:
SET PASSWORD FOR 'Karl'@'localhost' = PASSWORD('cleartext password');
As root , list all users
SELECT User FROM mysql.user;
As root , delete a user
drop user Rasdfas@localhost;
MySQL documentation on adding user accounts.

More on databases

Commands below work if your shell user name is the same as your mysql user name. If it's dufferent add -u username to the command.

List all databases (in mysql client)
show databases
Delete a database (in the mysql client)
DROP database databasename;
Rename a database  (in the shell)
mysqladmin -p create new_database
mysqldump -p old_database | mysql -p new_database
After you have verified that everything is in order
drop database old_database
Move a table from one database to another
mysqldump -p database_1 table_name | mysql -p database_2 
Back up only part of a database with the where option
mysqldump -p -w"productcode=440799" tradeflows raw_flow_yearly > sawnwood99raw.sql
Back up only the structure of a database (not the data)
mysqldump -p -d tradeflows > tradeflows.sql
Load a dump into a database :
cat file.sql | mysql -u username -p databasename

Configuration file my.cnf

It's not desirable to share user name and password in software disseminated over the internet. User name, password and database names can be placed under groups in the configuration file. For example for a given project enter this group in ~/.my.cnf
[project_name]
user = user
password = password
host = localhost
database = dbname
Then simply specify group=project_name to the client that accesses the database.

Data Visualisation Tools

A list of Data visualisation tools I've tried.

Desktop tools
  • R with ggplot2 package
  • Excel with pivot tables and charts
Web tools:
Sample websites:
Data storage tools:

Tuesday, December 03, 2013

Presentation with Beamer and Rnw

Copied from this post by Paul Hiemstra quoting a presentation by Yihui Xie.
There was a slight mistake in this presentation made with beamer and Rnw file. The code chunk options were not quoted properly. I corrected this in the code below and now it works.


\documentclass{beamer}
% Inspiration from
% http://www.r-bloggers.com/r-and-presentations-a-basic-example-of-knitr-and-beamer/

\begin{document}

\title{A Minimal Demo of knitr}
\author{Yihui Xie}

\maketitle

\begin{frame}[fragile]
You can test if \textbf{knitr} works with this minimal demo. OK, let's
get started with some boring random numbers:

<>=

set.seed(1121)
(x=rnorm(20))
mean(x);var(x)
@
\end{frame}

\begin{frame}[fragile]
The first element of \texttt{x} is \Sexpr{x[1]}. Boring boxplots
and histograms recorded by the PDF device:

<>=
## two plots side by side (option fig.show=hold)
boxplot(x)
hist(x,main='')
@
\end{frame}

\begin{frame}[fragile]
Plots
<>=

## two plots side by side (option fig.show=hold)
boxplot(x)
hist(x,main='')
@
\end{frame}

\end{document}