LIGO Gravitational Wave Data in iPython Jupyter Notebooks (ligo.org)
209 points by Mizza on Feb 13, 2016 | 32 comments


If you liked this, you may like this surprisingly fascinating talk on the Matplotlib color scheme used in this paper: https://www.youtube.com/watch?v=xAoljeRJ3lU&feature=youtu.be


This talk is great! And if you watched it all the way to the 6:30 mark, where he says "you can't have negative amounts of light", then you'll really enjoy chimerical colors: https://en.wikipedia.org/wiki/Impossible_color

The trick is you exploit retinal fatigue in order to perceive colors both "above 100%" and "below 0%". Wild stuff.


As someone who keeps getting comments about my "weird" toolkit for doing scientific research (Jupyter/Julia), this made my day.


What would people rather have you use? Excel?


You would be surprised at the number of Excel plots in the literature.

Most astrophysicists use IDL or IRAF; I've never seen anyone use Matlab. The benefit is the tons of functions specifically tailored for astro data analysis. Python is gathering momentum though; there are libraries like sunpy and astropy. Plenty of IDL fanatics are floored when you show them just how easy it is to process data with the Numpy stack.

I'm _not_ an astrophysicist, but I work with a lot of them. I use the Jupyter notebook daily. It's the perfect balance between a Python REPL and standalone scripts. My typical workflow is to hack something together in a notebook, which lets you iterate very quickly, then once I'm happy I freeze the code into a module.
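For a flavor of what that workflow looks like in practice, here's a minimal sketch (the file name and column layout are made up for illustration, not taken from the LIGO notebooks):

    # hypothetical quick look at a strain time series with the NumPy stack
    import numpy as np
    import matplotlib.pyplot as plt

    # assumed layout: two whitespace-separated columns, time [s] and strain
    t, strain = np.loadtxt("strain.txt", unpack=True)

    # crude detrending: subtract a running mean to remove slow drift
    window = 256
    smooth = np.convolve(strain, np.ones(window) / window, mode="same")

    plt.plot(t, strain - smooth)
    plt.xlabel("time [s]")
    plt.ylabel("strain (detrended)")
    plt.show()

Once cells like these stop changing, I paste them into a plain .py module and import that from the notebook.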


Haha, I forgot about IDL. My very first coding project as an undergrad was in IDL, until one day a new grad student showed up to the lab and said, "WTF? Do this in python."

When looking at plots in papers, there are always little giveaways about which program they were made in. Plot has horizontal gridlines, but no vertical gridlines: Excel. Plot is typeset in Arial, size 4: Matlab. Plot looks like it was sent through a fax a few times: IDL.


Did the LIGO paper use matplotlib? I thought it did but that would be too cool.


Undergrad astrophysics student here. The research classes at my institution are all taught in Python. I am writing my senior thesis in Python, and my advisor and his grad student do most of their work in C++ and Python.


IDL is not that bad, assuming you really understand when it generates an implicit loop and that the default int is just 16-bit. (And double-check: I wrote a few scripts that happily read only the first 32000 lines of a file, multiple times...)
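For anyone who hasn't hit it: because IDL's default integer is 16-bit, a plain loop counter silently wraps past 32767, which is exactly how you end up reading only the first ~32k lines of a file. A rough NumPy analogue of the same wraparound, just to illustrate (this is Python, obviously, not IDL):

    import numpy as np

    i = np.int16(32767)        # largest value a signed 16-bit int can hold
    print(i + np.int16(1))     # wraps to -32768 (NumPy may also emit an overflow warning)

In IDL the usual fix is to make the counter a long, e.g. start the loop at 0L instead of 0.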


I'm an astrophysics/CS student, and IDL has been on life support at my university for the last 3 years. Everything is done either in Fortran 90 or C/C++ for simulation code, and the data processing is done in Python with some combination of GNUplot/matplotlib/yt, or maybe R.


Oh good God, how could I forget about Fortran and GNUplot. F90 though? Luxury. All your code should be in F77 :P. I've worked on a few image reduction pipelines which were all in C++.


Matlab was used a ton by my school's (UCSC) Robotics and Signal Processing groups. Simulink seemed to be the killer app there.


The two that I'm familiar with are Mathematica and Matlab.

Mathematica is a lisp for representing and manipulating mathematical expressions, combined with an IDE that knows 2D layout (so you can write expressions like you would on paper) and a massive integrated library of mathematical routines. The "gateway drug" is its ability to symbolically integrate, differentiate, factor, simplify expressions, solve equations, interactively plot without explicitly sampling, etc. Then you discover that all its "heavy lifting" capabilities are integrated with each other -- i.e. you could use a piecewise implicit surface to define boundary conditions for a differential equation, solve it with finite elements on 20 different tessellation levels, and compare the results using a norm built out of an interpolator and integrator to check for convergence. All in a handful of lines of code where you only have to worry about high-level details rather than dozens of for loops and hundreds of lines of glue. I really don't think there's anything comparable in the open source ecosystem yet, but I'd love to be wrong (yes, I know about SAGE).
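For the "gateway drug" part specifically, the open-source SymPy library covers some of that ground, though nothing like the integrated FEM/geometry machinery described above (a tiny sketch):

    import sympy as sp

    x = sp.symbols("x")
    expr = x**2 * sp.exp(-x)

    print(sp.integrate(expr, (x, 0, sp.oo)))   # symbolic definite integral -> 2
    print(sp.diff(expr, x))                    # symbolic derivative
    print(sp.solve(x**2 - 2, x))               # exact roots: [-sqrt(2), sqrt(2)]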

Matlab is relatively unremarkable as a language -- it's not a lisp, it deals with matrices of floats, not expressions, and its only competitive language feature is the eponymous set of linear algebra primitives that CS-trained language lawyers tend to roll their eyes at but that really do make a difference for the scientists and engineers who use it day-to-day. The killer value proposition, though, is its collection of libraries. They're not symbolic like what you would find in Mathematica, but they're usually more extensive and relentlessly practical. Sometimes that means speed, sometimes that means features which cater to your particular obscure workflow, sometimes it means integration, but it always seems to result in a decision along the lines of "I could spend a day munging Python libraries A, B, C, and D together, or I could open Matlab, which already has a package and a GUI for it."

Python, Julia, and R do many things very well. They can beat Mathematica/Matlab in a number of areas, but there are still huge swaths of math/science/engineering where they're just not competitive. That goes double when you take legacy code into account. It's changing slowly, but science is a highly competitive environment which is not keen on rewarding contributions of this sort, so it could be quite a while before they catch up.


Mathematica is no Lisp. Mathematica's execution model is somewhat based on term rewriting.

Lisp is based on an evaluation model, a little bit inspired by lambda calculus.


Fair enough, I should have said "LISP-inspired."


Matlab, Mathematica, Jupyter/Python, or Jupyter/R, I expect.


Donald Knuth is probably swimming in joy right now. This is so literate programming (or should I say literate research)...


https://github.com/minrk/ligo-binder

Re-run the analysis yourself in Jupyter using Binder. Click the "launch binder" button.


Just curious about a Binder specific here. I could never get it to work out of the box. In this example, it gave me an "ImportError: No module named 'seaborn'" error. Now I could do something like !sudo pip install seaborn, but still... Did you guys have the same experience?
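If I understand Binder correctly, the cleaner fix is to add seaborn to the repo's requirements.txt so it gets baked into the image when the repo is built. Inside an already-running notebook, a user-level install also works without sudo (a sketch, assuming a stock IPython kernel):

    # run in a notebook cell; installs into the environment the kernel is using
    import sys
    !{sys.executable} -m pip install --user seaborn

    import seaborn as sns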


Can anyone tell me how they figured out where the black holes are?

With just 2 "ears" I'd expect to only be able to determine a circle, but here is the picture they released: http://content.screencast.com/users/cougarten/folders/Jing/m...

Do I see a warped circle, or the bottom part of one? In the latter case I wonder how they found that out.

Are the L-shaped sensors capable of sensing some direction, depending on which way the phases shift first/last?

If you built one of the sensors in reverse, you'd see a reversed signal, no? Given that they will have optimized the orientations of both stations, this is probably how it worked and I'm seeing just part of a circle, right? That one lonely blob might be an unlikely mirrored version.
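My own back-of-envelope attempt at the timing part (rough numbers I've seen quoted, not taken from the paper: the two sites are roughly 3000 km apart and the arrival-time difference was about 7 ms; the delay alone pins the source to a ring around the line joining the detectors, and apparently the relative amplitudes at the two L-shapes trim that ring down to the banana-shaped patch in the released image):

    import numpy as np

    c = 299792458.0      # speed of light, m/s
    baseline = 3.0e6     # Hanford-Livingston separation, roughly 3000 km
    dt = 7.0e-3          # quoted arrival-time difference, roughly 7 ms

    # the delay fixes the angle between the source direction and the baseline
    cos_theta = c * dt / baseline
    print(np.degrees(np.arccos(cos_theta)))   # ~45 degrees -> a ring on the sky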


For a moment I thought "notebook" meant like laptop.

And I was like, wow that is amazing, they crowdsourced data from all over the world.

Then I remembered the sensitivity of the instruments used and how dumb I was.

But it really would be cool if one day people had smartphones so advanced that they could contribute to a worldwide collection of data that needs a huge area to sample. I think they are talking about doing that for earthquake alerts.


WeatherSignal is doing something along those lines for weather data: http://weathersignal.com/about/ I'm excited to see where this kind of tech goes.


http://crayfis.io/

Use your smartphone camera to create a global network for detecting cosmic ray showers!


Makes for a nice demo of Python Notebook capabilities, especially when it comes to graphing.


As a python fanboy this makes me really happy. Great job, folks at PSF and Jupyter


Note the sample code is in Python 2.7, not 3.x. In general, you should avoid 3.x for scientific computing, as it is slower and less supported in academia.


This is a self-fulfilling prophecy.

Someone tries to write an analysis routine in 3.x, but then is quickly told "nobody in science uses 3.x, because of <vague reasons>". Result: everyone sticks to 2.7, so nobody ever even tries to build against 3.x when writing packages.


I switched to Python 3 after measuring a large ETL process I had converted. As scientific packages rely on FFI and the actual code runs outside of Python, I'm puzzled by your comment about speed.


Neither of these assertions is true. I use Py3 to teach graduate-level scientific computing and machine learning with great success, and the scientific Python community is making a very strong push for adoption. All major packages in the scientific stack now support Python 3.


That's interesting, do you have a good source on that? I switched to 3.x for development but I do academic projects too.


I don't know about speed, most of the modules where speed matters, like Numpy, rely on calling C routines somewhere down the line. The version of Python shouldn't make a significant difference.

Numpy, Scipy, and Pandas work with Python 3, and Jupyter certainly does. That covers probably 80% of scientific grunt work. For specialist applications, OpenCV 3.0 works, as do scikit-learn and scikit-image. I can't speak for other development like web though.

I think the main problem is that if you're using someone else's code in academia, it's likely to be written in 2.7, and you would have to go through and update everything that isn't Python 3 compatible.
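The usual low-effort bridge while that legacy code gets ported is to make the 2.7 code run under both interpreters first (a minimal sketch; larger codebases typically lean on the six or python-future packages):

    # at the top of each 2.7 module, so it behaves the same under Python 3
    from __future__ import division, print_function

    print(3 / 2)   # 1.5 under both interpreters once true division is enabled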


slower.. <citation needed>



