This is a post from my LinkedIn page on my hopes for Jupyter notebooks and git. Does anyone know of progress along these lines?
# Jupyter notebook and git
Jupyter Notebooks have been a great tool for data science, but the transition to deployment and their general friendliness to software engineering could use some work. From time to time, I have explored how others turn notebooks into an organized codebase and outputs. To date, I have not found an approach I am comfortable with. My ideal would be something like 'node metadata' in the manner of [Leo Editor](https://leo-editor.github.io/leo-editor/), functioning as 'decorators' on a notebook cell for integration with git.
By this I mean using something like special markers in Python comments (since much of data science is done with Python) to map the content of a cell (or output) to a git repository. Better yet, define a special cell type for git metadata preceding a code cell. Then implement some basic git operations on the contents of a cell. Let's suppose we use @@git as a marker for metadata in comments for git.
--- beginning of cell ---
# @@git %upstream%=https://github.com/pyro-ppl/pyro
# @@git %local%=~/repo/pyrodev
# @@git %branch%=burnburnburn
# @@git %file%=examples/cvae/util.py
# Here begins the contents of the util.py file
...
--- end of cell ---
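To make the marker idea concrete, here is a minimal sketch of how an extension might separate the `@@git` comment lines from the rest of a cell. The exact `# @@git %key%=value` grammar, the regular expression, and the function name `split_git_cell` are my own assumptions for illustration, not an existing Jupyter API.

```python
import re

# Assumed marker syntax, taken from the example cell: "# @@git %key%=value"
GIT_MARKER = re.compile(r"^#\s*@@git\s+%(?P<key>\w+)%=(?P<value>.*)$")


def split_git_cell(source):
    """Split a cell's source into (@@git metadata dict, remaining body text)."""
    metadata = {}
    body_lines = []
    for line in source.splitlines():
        match = GIT_MARKER.match(line.strip())
        if match:
            # Marker line: record the key/value pair, keep it out of the body
            metadata[match.group("key")] = match.group("value").strip()
        else:
            body_lines.append(line)
    return metadata, "\n".join(body_lines)


# The example cell from the post; the "..." stands in for the elided file body
cell_source = """# @@git %upstream%=https://github.com/pyro-ppl/pyro
# @@git %local%=~/repo/pyrodev
# @@git %branch%=burnburnburn
# @@git %file%=examples/cvae/util.py
# Here begins the contents of the util.py file
...
"""

metadata, body = split_git_cell(cell_source)
```

With this split, the body is what would be written to the file named by `%file%`, and the marker lines never reach the repository.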
An extension would implement items in the menubar for various git operations:
stage - write the cell content to the util.py file and stage it
checkout - check out from upstream, replace the local copy, and refresh the content of the cell
commit - commit the staged file specified by %file%
status - ...
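As a rough sketch of what the stage and commit menu items might do behind the scenes, the functions below write the cell body into the local clone and shell out to the ordinary `git` command line. The function names, the metadata dictionary keys (`local`, `file`), and the injectable `run` parameter are hypothetical choices of mine; only `git -C`, `git add`, and `git commit -m` are real git usage.

```python
import subprocess
from pathlib import Path


def stage_cell(metadata, body, run=subprocess.run):
    """Write the cell body to %file% inside %local%, then `git add` it."""
    repo = Path(metadata["local"]).expanduser()
    target = repo / metadata["file"]
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(body + "\n", encoding="utf-8")
    # `git -C <dir>` runs the command as if git had been started in <dir>
    run(["git", "-C", str(repo), "add", "--", metadata["file"]], check=True)
    return target


def commit_cell(metadata, message, run=subprocess.run):
    """Commit only the file named by %file%, leaving other changes alone."""
    repo = Path(metadata["local"]).expanduser()
    run(
        ["git", "-C", str(repo), "commit", "-m", message, "--", metadata["file"]],
        check=True,
    )
```

Passing `run` as a parameter is only there so the git calls can be swapped out when trying the idea without a real repository; a real extension would also need to handle the branch and checkout cases, which this sketch leaves out.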
The imagined workflow is that once a working idea scattered throughout a notebook has been sketched out, the user would mark the notebook cells that should be mapped to files in a git repository. This could also be used in a mixed dev/data science environment, where library code under development can be pulled right into a notebook.
Yes, there will be problems with committing code that contains comments specific to one user, which is why a special cell type makes sense. Yes, there will be problems that I can't even imagine right now but ...
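One possible way to keep user-specific settings out of the committed code, short of a new cell type, is the per-cell `metadata` dictionary that the notebook file format already provides. The sketch below works on a plain dictionary shaped like a notebook code cell; the `git` key and the function `attach_git_metadata` are assumptions of mine, not an established convention.

```python
import json


def attach_git_metadata(cell, **mapping):
    """Store the git mapping under a hypothetical 'git' key in cell metadata."""
    cell.setdefault("metadata", {})["git"] = dict(mapping)
    return cell


# A plain dict shaped like a notebook code cell; the source stays marker-free
cell = {
    "cell_type": "code",
    "execution_count": None,
    "metadata": {},
    "outputs": [],
    "source": "# Here begins the contents of the util.py file\n",
}

attach_git_metadata(
    cell,
    upstream="https://github.com/pyro-ppl/pyro",
    local="~/repo/pyrodev",
    branch="burnburnburn",
    file="examples/cvae/util.py",
)

# The mapping travels with the notebook, not with the file written to git
serialized = json.dumps(cell, indent=1)
```

The trade-off is that the mapping is then invisible in the cell itself unless the extension displays it somewhere.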
Please message me if you know of a cell-based git extension for Jupyter Notebooks.