
This is a post from my LinkedIn page on my hopes for Jupyter notebooks and git. Does anyone know of progress along these lines?

#Jupyter notebook and git

As much as Jupyter Notebooks have been a great tool for data science, the transition to deployment and their general friendliness to software engineering could use some work. From time to time, I have explored how others have dealt with turning notebooks into an organized codebase and outputs. To date, I have not found an approach I am comfortable with. The ideal approach for me would be to use something like 'node metadata' in the way of [Leo Editor](https://leo-editor.github.io/leo-editor/) to function as 'decorators' on a notebook cell for integration with git.

By this I mean using something like special markers in Python comments (since much of data science is done with Python) to map the content of a cell (or output) to a git repository. Better yet, define a special cell type for git metadata preceding a code cell. Then implement some basic git operations on the contents of a cell. Let's suppose we use @@git as a marker for metadata in comments for git.

    --- beginning of cell ---
    # @@git %upstream%=https://github.com/pyro-ppl/pyro
    # @@git %local%=~/repo/pyrodev
    # @@git %branch%=burnburnburn
    # @@git %file%=examples/cvae/util.py

    # Here begins the contents of the util.py file
    ...
    --- end of cell ---
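To make the marker idea concrete, here is a minimal sketch of how an extension might pull the @@git metadata out of a cell. The function name `parse_git_cell` and the exact `# @@git %key%=value` grammar are my assumptions based on the example cell above, not an existing Jupyter API.

```python
import re

# Hypothetical marker format taken from the example cell: "# @@git %key%=value"
GIT_MARKER = re.compile(r"^#\s*@@git\s+%(\w+)%=(.*)$")


def parse_git_cell(source):
    """Split a cell's source into (metadata dict, remaining file content)."""
    metadata = {}
    body_lines = []
    for line in source.splitlines():
        match = GIT_MARKER.match(line.strip())
        if match:
            # Marker line: record the key/value pair and keep it out of the file body.
            metadata[match.group(1)] = match.group(2).strip()
        else:
            body_lines.append(line)
    # Drop the blank line(s) that separate the markers from the file content.
    body = "\n".join(body_lines).lstrip("\n")
    return metadata, body


cell = """# @@git %upstream%=https://github.com/pyro-ppl/pyro
# @@git %local%=~/repo/pyrodev
# @@git %branch%=burnburnburn
# @@git %file%=examples/cvae/util.py

# Here begins the contents of the util.py file
"""
metadata, body = parse_git_cell(cell)
```

Keeping the markers out of `body` means the user-specific comments would never reach the committed file.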

An extension would implement items in the menubar for various git operations:

    stage - stage the content as the util.py file
    checkout - check out from upstream, replace the local copy, and refresh the content of the cell
    commit - commit the staged file specified by %file%
    status - ...
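As a rough sketch, the menu operations could translate into ordinary git invocations along these lines. The helper names `write_cell_to_repo` and `git_commands`, the default commit message, and the choice to fetch the branch straight from the %upstream% URL are all assumptions of mine; this only builds the command lines and does not run them.

```python
import os
from pathlib import Path


def write_cell_to_repo(metadata, body):
    """Write the cell body to %local%/%file% and return the path written."""
    target = Path(os.path.expanduser(metadata["local"])) / metadata["file"]
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(body)
    return target


def git_commands(operation, metadata, message="Update from notebook cell"):
    """Return the git invocations (as argv lists) for one menu operation."""
    repo = os.path.expanduser(metadata["local"])
    path = metadata["file"]
    git = ["git", "-C", repo]  # -C runs git as if started inside the local repo
    if operation == "stage":
        return [git + ["add", "--", path]]
    if operation == "checkout":
        # Fetch the branch from upstream, then overwrite only the mapped file.
        return [
            git + ["fetch", metadata["upstream"], metadata["branch"]],
            git + ["checkout", "FETCH_HEAD", "--", path],
        ]
    if operation == "commit":
        return [git + ["commit", "-m", message, "--", path]]
    if operation == "status":
        return [git + ["status", "--short", "--", path]]
    raise ValueError(f"unknown operation: {operation}")
```

Each returned list could be passed to `subprocess.run(cmd, check=True)`. A stage action would call `write_cell_to_repo` first, and a checkout action would read the file back into the cell afterwards.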

The imagined workflow is that once a working idea scattered throughout a notebook has been sketched out, the user would mark the notebook cells that should be mapped to files in a git repository. This could also be used in a mixed dev/data science environment where library code under development can be pulled right into a notebook.
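For the "mark the cells" step, a tool would first need to find which cells carry markers. Below is a small sketch that reads a notebook's JSON directly (an .ipynb file is JSON with a top-level "cells" list) and returns the marked code cells. The function name `marked_cells` and the `# @@git` prefix check are my own assumptions.

```python
import json


def marked_cells(notebook_json):
    """Return (index, source) for every code cell containing a '# @@git' line."""
    notebook = json.loads(notebook_json)
    found = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # The notebook format allows source as one string or a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        if any(line.lstrip().startswith("# @@git") for line in source.splitlines()):
            found.append((index, source))
    return found


example = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": ["# Notes\n"]},
        {"cell_type": "code", "source": ["# @@git %file%=examples/cvae/util.py\n", "x = 1\n"]},
        {"cell_type": "code", "source": "print('scratch work')"},
    ]
})
cells_to_export = marked_cells(example)
```

Unmarked scratch cells are simply skipped, so only the parts of the notebook the user has chosen end up mapped to repository files.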

Yes, there will be problems with committing code with comments that are specific to one user, which is why a special cell type makes sense. Yes, there will be problems that I can't even imagine right now, but ...

Please message me if you know of a cell-based git extension for Jupyter Notebooks.


