This post is a review of IPython Notebook Essentials by L. Felipe Martins.
IPython Notebook is a very cool environment for running Python code. I've flirted with it in the past, but never really got to grips with it. So I was pleased to be contacted by Packt Publishing with the offer of a free copy (ebook) to review.
In its own words, the book is for:
software developers, engineers, scientists, and students who need a quick introduction to the IPython notebook for use in scientific computing, data handling, and analysis, creation of graphical displays, and efficient computations.
Overall, I think it succeeds pretty well. Taking the time to go through the book, rather than just trying to get things done as fast as possible, certainly meant that I learnt things I wouldn't have otherwise.
The book starts helpfully with a chapter that shows off the functionality of IPython Notebook, “A Tour of the IPython Notebook”, which seems to be a really good idea for the intended audience. It begins by providing some good ways to access it: either installed on your machine using Anaconda, which provides a massive set of useful libraries, or via Wakari.
This chapter includes a classic problem I've thought about in the past: when is the optimal time to add cream/milk to your tea/coffee to keep it hot? The chapter works the problem right through to generating 3D plots. In fact, the book is full of examples that would be fairly familiar to different types of science students.
There are lots of helpful notes in the book about best practices and gotchas, such as tips about namespace pollution, ints/floats, indentation, exponentiation, and meaningful variable names. Very occasionally I thought there were words that might go over the heads of the intended audience, but generally I thought it was really good.
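To give a flavour of the problem, here is a minimal sketch of my own (not the book's code), assuming Newton's law of cooling and equal specific heats for coffee and milk. The function names, temperatures, cooling constant and milk fraction are all made-up illustrative values.

```python
import math

def cool(temp, minutes, ambient=20.0, k=0.1):
    """Newton's law of cooling: exponential decay towards the ambient temperature.

    `ambient` and `k` are illustrative assumptions, not values from the book.
    """
    return ambient + (temp - ambient) * math.exp(-k * minutes)

def mix(coffee_temp, milk_temp=5.0, milk_fraction=0.2):
    """Temperature after mixing, assuming coffee and milk have equal specific heats."""
    return (1 - milk_fraction) * coffee_temp + milk_fraction * milk_temp

start_temp = 90.0   # freshly poured coffee, in degrees Celsius (assumed)
wait = 10.0         # minutes before drinking (assumed)

milk_first = cool(mix(start_temp), wait)   # add milk now, then wait
milk_last = mix(cool(start_temp, wait))    # wait, then add milk

print(f"milk first: {milk_first:.1f} C, milk last: {milk_last:.1f} C")
```

With these particular numbers, adding the milk straight away leaves the drink slightly warmer at the end, because the cooler mixture loses heat to the room more slowly.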
Chapter two goes into a lot more detail about the IPython Notebook interface and features. I learnt a lot, including how to use LaTeX and Markdown in the Notebook, and even that you can convert IPython Notebooks into nice-looking HTML slides. I was glad for a book that kind of forced me to learn about the tool, rather than my previous approach which was much more oriented to the immediate task.
It continues with a section on IPython ‘magics’ which was also really good, and has encouraged me to explore further what is available.
This chapter also contained a lot of useful information for scientists, including things like quick ways to time and optimise your code, how to load/save data into the environment, and lots of important tips regarding possible gotchas.
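As a rough illustration of the timing tips (my own sketch, not an example from the book), the standard library's `timeit` module can compare two ways of doing the same job. Inside a Notebook the `%timeit` magic offers the same kind of measurement in a single line. The two functions here are hypothetical stand-ins.

```python
import timeit

def squares_loop(n):
    """Build a list of squares with an explicit loop."""
    result = []
    for i in range(n):
        result.append(i * i)
    return result

def squares_comprehension(n):
    """Build the same list with a list comprehension."""
    return [i * i for i in range(n)]

# Run each version 200 times and report the total elapsed time in seconds.
loop_time = timeit.timeit(lambda: squares_loop(1000), number=200)
comp_time = timeit.timeit(lambda: squares_comprehension(1000), number=200)

print(f"loop: {loop_time:.4f} s, comprehension: {comp_time:.4f} s")
```

The actual timings depend on your machine, so treat them as relative rather than absolute figures.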
One tip I think they probably should have included under the %load magic was a security one: loading and running arbitrary code from the internet is, of course, pretty dangerous, and there is virtually no limit to what Python code can do on your machine. With Python running in a Notebook environment, I think it might be easy for scientists to forget about the need to protect themselves from malicious code.
Chapter three focuses on creating plots with matplotlib, and goes into a lot of detail about 2D and 3D plots. I think it provides some descriptions and help that you don't get just by browsing example code on matplotlib.org. It also includes examples of animations.
Chapter four provides a great tutorial on using pandas for loading and manipulating data sets. I thought this was very well done, and included some more advanced manipulations that you are very likely to need when input data isn't in the right ‘shape’. There were tips and shortcuts that I probably wouldn't have found otherwise.
Chapter five goes on to “Advanced Computing with SciPy, Numba, and NumbaPro”. I didn't try out this chapter as fully as the others, as I don't have a CUDA compatible GPU that I could use for the examples. However, what I followed regarding scientific computing and numerical calculations was very helpful.
The book finishes with 3 appendices:
Appendix A. IPython Notebook Reference Card
Appendix B. A Brief Review of Python
Appendix C. NumPy Arrays
These all seemed to be just right for the typical non-programmer who nevertheless needs a reasonable grasp of Python and NumPy. As well as Appendix B, the earlier chapters demonstrated and explained the power of Python in some helpful ways, e.g. the use of first-class functions and factory functions, which can be really important when you need to parameterise a function by one variable and plot the result against another.
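For anyone unfamiliar with the idea, a factory function might look something like the sketch below. This is my own illustration rather than the book's example: `make_decay` and the rates used are hypothetical.

```python
import math

def make_decay(rate):
    """Factory function: return a new function of t with `rate` fixed."""
    def decay(t):
        return math.exp(-rate * t)
    return decay

times = [0.0, 0.5, 1.0, 1.5, 2.0]

# One curve per rate: each curve is the values of a function of t alone,
# which is exactly the shape a plotting call expects.
curves = {rate: [make_decay(rate)(t) for t in times] for rate in (0.5, 1.0, 2.0)}

for rate, values in curves.items():
    print(rate, [round(v, 3) for v in values])
```

Each entry in `curves` could then be handed to a plotting call against `times`, one line per rate.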
One thing the book didn't explain in perhaps enough detail (that I saw), although it does allude to it, is the problem of namespace pollution. This rears its ugly head when %pylab is used, or any import *, and you get nasty surprises such as numpy.sum shadowing the builtin sum. I did trip up over this, and it can be quite nasty to work out what is going on, especially when the broadcasting rules for NumPy arrays can be rather magical. Sometimes the builtin sum will give you working code and numpy.sum will not, and many people will not be aware that they are actually using numpy.sum.
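Here is a small sketch of the kind of surprise I mean (my own example, not from the book). It calls the two functions by their full names so the difference is visible; after a star import, a bare `sum` would silently mean the NumPy one.

```python
import builtins
import numpy as np

values = [0, 1, 2, 3, 4]

# Builtin sum: the second positional argument is the *start* value.
builtin_total = builtins.sum(values, -1)   # 0+1+2+3+4 + (-1)

# numpy.sum: the second positional argument is the *axis*.
numpy_total = np.sum(values, -1)           # sums along the last axis

print(builtin_total, numpy_total)          # 9 10

grid = np.array([[1, 2], [3, 4]])

# Builtin sum iterates over the rows and adds them element-wise.
builtin_grid = builtins.sum(grid)          # array([4, 6])

# numpy.sum collapses every axis by default.
numpy_grid = np.sum(grid)                  # 10

print(builtin_grid, numpy_grid)
```

The same call gives two different answers depending on which `sum` is in scope, which is why explicit imports such as `import numpy as np` are the safer habit.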
In general, the attention to detail was good. I found one section where the example code did not work, and I've submitted errata for it.
By the end of the book, I realised I could use IPython Notebook to re-implement an actual science project: my final year project at University, which is a long time ago now (2002)! The project involved calculating electric fields and simulating electron movements in a scanning electron microscope. I did it originally using a Mathematica-like system called MuPAD, which could do both symbolic and numerical manipulations.
Re-doing this as an IPython Notebook was quite fun, and a good chance to practise and embed my newly learned 3D plotting skills. It did highlight some ways that Python is not perfect for this task, such as the integration with symbolic manipulation, which is not as smooth as in a language designed for it.
But one of the really nice things about IPython Notebook is that you can write LaTeX easily, so that your code can be interspersed with nicely presented mathematical formulae, instead of using ASCII art etc. I daresay that it won't be long before IPython Notebooks will be a preferred way of submitting this kind of work.
As well as embedding maths, you can of course embed the cool plots:
For reference and explanation, you can have a look at my original, brilliantly-titled SEM Dopant Contrast due to Patch Fields in Semiconductors. (It does include the main source code at the end).
This little project was also a neat demonstration of Moore's law. The project I did at University in 2002 was a follow-up to a 1969 project also at Cambridge (Plows GS, Stroboscopic Scanning Electron Microscopy and the observation of microcircuit surface voltages), in which it looks like a large amount of time was spent calculating (using pen and paper, I think) the trajectory of a single electron in a Scanning Electron Microscope under certain conditions.
For my own project in 2002, I used my PC to do thousands of such calculations, but I ended up thrashing my computer’s CPU for several weeks on end. Based on what I've done the past couple of days, however, the same calculations would be complete within an hour on my current laptop. And that was with using only one out of 4 cores.
I think this book probably hits the spot just right for its intended audience, and I think that is a spot which needs hitting — I can see IPython Notebook becoming an indispensable tool for many scientists and students.