Python (and R) Data Visualization Using Bokeh: NIST digits
I want to preface this post by saying that Bokeh is available for both Python and R.  It was designed to bring D3-like visualization to Python.  To do this, the developers created a Python layer that talks to a JavaScript layer (written in CoffeeScript).  This two-layer design has allowed developers to swap the Python layer for R and other languages, providing the same functionality in each.  That said, I am not an R user, so my experience is limited to the Python side.

I have been looking at the Scikit-Learn package but have been annoyed that its visualizations typically use Matplotlib.  Matplotlib is a great package, with great developers working on it.  I prefer Matplotlib for making static images for publications, but have been frustrated using it for data exploration, particularly within Jupyter Notebook.  I was introduced to Bokeh a while back and have become a big fan, as it excels at data exploration and integrates seamlessly with Jupyter Notebook.  For many examples of the awesome interactive visualizations you can make with Bokeh, check out the gallery.  The downside of using Bokeh with Scikit-Learn is that Bokeh is relatively new, so most resources demonstrate Matplotlib instead.

For this project, I wanted to figure out how to use Bokeh to visualize the digits dataset included in Scikit-Learn.  In Matplotlib, this is accomplished using the `imshow` command.  Bokeh provides the command `bokeh.plotting.figure.image_rgba`, which seems to be the closest equivalent.  To use it, the 4-bit digit images (8x8 arrays with intensities from 0 to 16) must be converted into 32-bit RGBA images.  Here is how I did this:

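import numpy as np

def rgba_from_4bit(img_4):
    """Convert a 2-D array of 0-16 intensities into a packed 32-bit RGBA image."""
    n, m = img_4.shape
    img_rgba = np.empty((n, m), dtype=np.uint32)
    # view each packed uint32 pixel as four uint8 channels (R, G, B, A)
    view = img_rgba.view(dtype=np.uint8).reshape((n, m, 4))
    view[:, :, 3] = 255  # set all alpha values to fully opaque
    # map 0..16 onto 255..0 so that high intensities render dark
    gray = 255 - img_4 / 16.0 * 255
    # Bokeh draws row 0 at the bottom, so the image is upside-down; hence the ::-1
    view[:, :, 0] = view[:, :, 1] = view[:, :, 2] = gray[::-1]
    return img_rgba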
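
The function above relies on NumPy's `view`, which reinterprets the same memory with a different dtype. As a minimal sketch of that idea (the names `pixel`, `channels`, and `packed` are only for illustration and are not from the notebook), a single pixel can be written channel by channel and then read back as one packed integer:

```python
import sys

import numpy as np

# One pixel stored as a single 32-bit integer
pixel = np.zeros((1, 1), dtype=np.uint32)

# The same four bytes seen as separate 8-bit channels; no data is copied
channels = pixel.view(dtype=np.uint8).reshape((1, 1, 4))
channels[0, 0] = (10, 20, 30, 255)  # R, G, B, A in memory order

# Writing through the view changed the packed integer as well
packed = int(pixel[0, 0])
print(sys.byteorder, hex(packed))  # on a little-endian machine: little 0xff1e140a
```

Because the channels are written in memory order, the code does not have to care about byte order; only the numeric value of the packed integer differs between little-endian and big-endian machines.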

For a full example showing how this was implemented, check out my notebook on Anaconda Cloud.  With this, I can now proceed to work with the Scikit-Learn NIST digits dataset while visualizing the results in Bokeh.
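
For readers without the notebook at hand, here is a rough sketch of how the converted image might be passed to Bokeh. This is not the notebook's code: `digit_figure` is a hypothetical helper, the conversion function is repeated only so the block runs on its own, and the choice of axis ranges is an assumption. Bokeh is imported inside the helper so that the conversion can be tried without it.

```python
import numpy as np


def rgba_from_4bit(img_4):
    """Same conversion as in the post: 0-16 intensities to packed RGBA."""
    n, m = img_4.shape
    img_rgba = np.empty((n, m), dtype=np.uint32)
    view = img_rgba.view(dtype=np.uint8).reshape((n, m, 4))
    view[:, :, 3] = 255  # fully opaque
    gray = 255 - img_4 / 16.0 * 255
    view[:, :, 0] = view[:, :, 1] = view[:, :, 2] = gray[::-1]
    return img_rgba


def digit_figure(img_4, title=None):
    """Hypothetical helper: return a Bokeh figure showing one digit image."""
    from bokeh.plotting import figure  # imported lazily; requires Bokeh

    n, m = img_4.shape
    # Assumption: one plot unit per pixel, with the image anchored at (0, 0)
    p = figure(title=title, x_range=(0, m), y_range=(0, n))
    p.image_rgba(image=[rgba_from_4bit(img_4)], x=0, y=0, dw=m, dh=n)
    return p
```

With Scikit-Learn and Bokeh installed, something like `show(digit_figure(load_digits().images[0]))` should display the first digit, where `load_digits` comes from `sklearn.datasets` and `show` from `bokeh.plotting` (or `bokeh.io`, after `output_notebook()` in a Jupyter Notebook).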

I definitely recommend using Bokeh, as it is great for data exploration.  It also allows for hosting interactive images on webpages, live streaming data, and many other cool features.  The biggest frustrations I have experienced as a user are the lack of LaTeX support, the inability to generate vector graphics, and the limited support for creating images in headless mode, i.e., without an X server.  Feel free to check out the other notebooks I have on there, which demonstrate another great package, Datashader (with some Bokeh), designed for visualizing large datasets while avoiding over-plotting.
Interesting, thanks for pointing out that bokeh works with R, I didn't know that! I like the style and structure of your notebook (comparison with matplotlib, conclusion, to do list...). Keep up the good work!
You can follow my learning club progress and get R tips here.


About Becoming A Data Scientist

BecomingADataScientist.com is a blog created by Renee Teate to track her path from "SQL Data Analyst pursuing an Engineering Master's Degree" to "Data Scientist". She created this club so participants can work together and help one another learn data science. See her other site DataSciGuide for more learning resources.
