Exporting

The tradeoffs of using an interactive programming environment

In [1]:
%reload_ext autoreload
%autoreload 2
%matplotlib inline
In [2]:
import json,re
from pathlib import Path
import io

Let's start by pointing at an ipynb file.

In [7]:
fname = Path('01_tensors_matmul.ipynb'); fname
Out[7]:
WindowsPath('01_tensors_matmul.ipynb')

An ipynb file is plain-text JSON, so loading it with json.load gives us a dictionary.

In [8]:
fname_out = f'nb_{fname.stem.split("_")[0]}.py'; fname_out
Out[8]:
'nb_01.py'
In [9]:
# use a context manager so the file handle is closed after loading
with open(fname,'r',encoding="utf-8") as f: main_dic = json.load(f)
In [10]:
main_dic.keys()
Out[10]:
dict_keys(['cells', 'metadata', 'nbformat', 'nbformat_minor'])
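To make the JSON layout concrete, here is a minimal sketch that writes a hand-built two-cell notebook to a temporary file and loads it back the same way. The names toy_nb and toy_path, and the cell contents, are hypothetical and only mirror the keys shown in the output above.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical two-cell notebook, written by hand to show the JSON layout.
toy_nb = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {"cell_type": "code", "execution_count": None, "metadata": {},
         "outputs": [], "source": ["#export\n", "x = 1"]},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 2,
}

# Write the dictionary out as JSON, then read it back with json.load.
with tempfile.TemporaryDirectory() as tmp:
    toy_path = Path(tmp) / "toy.ipynb"
    toy_path.write_text(json.dumps(toy_nb), encoding="utf-8")
    with open(toy_path, "r", encoding="utf-8") as f:
        loaded = json.load(f)

# Each entry under 'cells' is itself a dictionary with a 'cell_type'.
cell_types = [c["cell_type"] for c in loaded["cells"]]
print(sorted(loaded.keys()))
print(cell_types)
```

The round trip returns the same dictionary that went in, which is all that loading a real notebook does.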

Under the cells key we find every cell of the Jupyter notebook along with its properties.

In [11]:
main_dic['cells'][0]
Out[11]:
{'cell_type': 'code',
 'execution_count': 1,
 'metadata': {},
 'outputs': [],
 'source': ['# default_exp nb_01']}

Now we want to look at each cell and decide whether it should be exported as part of our module.

We mark a cell for export by putting #export on its first line. A helper function then reads each cell of the loaded notebook and returns True if that tag is present and False otherwise.

In [12]:
def is_export(cell):
    # if the cell is not code return false
    if cell['cell_type'] != 'code': return False
    src = cell['source']
    # skip empty cells, or a first line too short to hold '#export' (7 characters)
    if len(src) == 0 or len(src[0]) < 7: return False
    return re.match(r'^\s*#\s*export\s*$', src[0], re.IGNORECASE) is not None
In [13]:
code_cells = [c for c in main_dic['cells'] if is_export(c)]
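To see which first lines the pattern accepts, here is a small sketch that applies the same regular expression to a few sample strings. The helper is_export_line and the samples list are hypothetical names used only for this illustration.

```python
import re

# Same pattern as in is_export: optional whitespace, '#', optional
# whitespace, the word 'export', then only whitespace up to the end.
EXPORT_RE = re.compile(r'^\s*#\s*export\s*$', re.IGNORECASE)

def is_export_line(line):
    """Return True if a cell's first line is an export tag."""
    return EXPORT_RE.match(line) is not None

samples = [
    '#export\n',             # plain tag
    '# export',              # space after the hash, no newline
    '#EXPORT  \n',           # upper case with trailing spaces
    '# exported\n',          # extra characters after 'export'
    'x = 1  #export\n',      # tag is not at the start of the line
    '# default_exp nb_01',   # a different comment entirely
]

flags = {line: is_export_line(line) for line in samples}
for line, flag in flags.items():
    print(repr(line), flag)
```

Only the first three samples match. The tag must be the whole first line, so a trailing comment on a line of code or a longer word such as "exported" does not mark the cell.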

Now we have a list of all the cells marked for export.

In [14]:
code_cells[0]
Out[14]:
{'cell_type': 'code',
 'execution_count': 28,
 'metadata': {},
 'outputs': [],
 'source': ['#export\n',
  'import operator\n',
  '\n',
  'def test(a, b, comp, cname=None):\n',
  '    if cname is None: cname = comp.__name__\n',
  '    assert comp(a,b), f"{cname}: \\n{a} \\b{b}"\n',
  '\n',
  'def test_eq(a,b): \n',
  '    test(a,b, operator.eq, "==")']}

Lastly, we iterate through the list and collect the code into a string called module, ready to be written out as a Python file. The first line of each cell (the #export tag) is dropped.

In [15]:
module = f'''
#################################################
### THIS FILE WAS AUTOGENERATED! DO NOT EDIT! ###
#################################################
# file to edit: dev_nb/{fname.name}

'''

for cell in code_cells: 
    module += ''.join(cell['source'][1:]) + '\n\n'
In [16]:
module
Out[16]:
'\n#################################################\n### THIS FILE WAS AUTOGENERATED! DO NOT EDIT! ###\n#################################################\n# file to edit: dev_nb/01_tensors_matmul.ipynb\n\nimport operator\n\ndef test(a, b, comp, cname=None):\n    if cname is None: cname = comp.__name__\n    assert comp(a,b), f"{cname}: \\n{a} \\b{b}"\n\ndef test_eq(a,b): \n    test(a,b, operator.eq, "==")\n\n\ndef near(a,b): \n    """Test if two tensors are nearly identical"""\n    return torch.allclose(a,b, rtol=1e-03, atol=1e-05)\n\ndef test_near(a,b): \n    test(a,b, near)\n\n'
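The cells above build module as a string and fname_out as the target file name; the remaining step is writing one to the other. A minimal sketch of that step, assuming Path.write_text is an acceptable way to do it: the shortened module value and the temporary out_dir are placeholders for illustration, not part of the notebook.

```python
import tempfile
from pathlib import Path

# Placeholders standing in for the `module` and `fname_out` values built above.
module = "\n# file to edit: dev_nb/01_tensors_matmul.ipynb\n\nimport operator\n\n"
fname_out = "nb_01.py"

# out_dir is an assumption; a temporary directory keeps the sketch free of
# side effects. In practice this would be the folder holding the module.
with tempfile.TemporaryDirectory() as out_dir:
    out_path = Path(out_dir) / fname_out
    out_path.write_text(module, encoding="utf-8")
    # Read the file back to confirm what was written.
    written = out_path.read_text(encoding="utf-8")

print(out_path.name)
```

Writing with an explicit encoding matches how the notebook was read, so non-ASCII characters in exported cells survive the round trip.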

We now have a quick way to export our functions and build up a module as we go in Jupyter notebooks.