Note: watch my live coding session of this article:
Intro
If you've heard rumors that Emacs has a very steep learning curve (or that Emacs makes a computer slow), you may be too scared to look at it. It does have a learning curve (learning anything does), but it isn't very steep. I learned this after getting my hands dirty with Emacs a few years ago.
Anyway, if you're still curious about it, I want to show you what it is like to write a Python script in Emacs using the Elpy (Elpy Docs) package.
The task is to join the corresponding lines of several files into one line of a result file, separated by a user-specified delimiter: the first line of every input file becomes the first line of the result, and so on.
For example, if I have three text files as below:
$ cat software.txt
GNU Emacs
Linux
Python
$ cat software-author.txt
Richard Stallman
Linus Torvalds
Guido van Rossum
$ cat software-birthdate.txt
1989-03-20
1991-09-17
1991-02-20
Now I want to merge them into a new CSV (comma-separated values) file. All I have to do is type:
$ python merge-files.py software-info.csv , software.txt software-author.txt software-birthdate.txt
And software-info.csv will have the result as follows:
$ cat software-info.csv
GNU Emacs,Richard Stallman,1989-03-20
Linux,Linus Torvalds,1991-09-17
Python,Guido van Rossum,1991-02-20
You may ask, what's the use case for this script?
The idea came up when I was completing a data processing task of CSV files at work, which required me to decrypt some encrypted fields of data dumped from a database. With this script and some shell scripting, I first split the fields into different files, decrypted specific files, and then joined them back into a CSV file.
Writing the Python script
The requirement is quite straightforward, so the next step is to design the script and implement it.
Here I want to write the core function in a functional style, so that, apart from reading and writing files, the core is a pure function with no external dependencies (it depends only on its arguments). Its signature is as follows:
def merge_multi_lines(separator, *multi_lines):
    '''Merge every line of `multi_lines` into a new line, which are
    delimited by the separator.'''
    merged_lines = []
    ...
    return '\n'.join(merged_lines)
I can then evaluate the function and test it with a few simple cases to catch bugs quickly. In case you haven't come across it, this is the workflow of interactive programming, which lets us test each snippet as early as possible during development, so we don't have to postpone testing until the whole program is finished.
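To make this concrete, here is a minimal sketch of how the elided body could be filled in, followed by quick checks of the kind one might send to the Python shell. It is not necessarily how the original script does it. It assumes that each element of `multi_lines` is a single string holding newline-separated lines, and it relies on `zip()`, which stops at the shortest input.

```python
def merge_multi_lines(separator, *multi_lines):
    """Join the n-th line of every input into the n-th line of the result."""
    # Assumption: each argument is one string of newline-separated lines.
    split_inputs = [text.splitlines() for text in multi_lines]
    # zip() pairs up the 1st lines, then the 2nd lines, and so on. It stops
    # at the shortest input, so extra trailing lines are silently dropped.
    merged_lines = [separator.join(row) for row in zip(*split_inputs)]
    return '\n'.join(merged_lines)


# Quick checks to evaluate right after defining the function.
assert merge_multi_lines(',', 'a\nb', '1\n2') == 'a,1\nb,2'
assert merge_multi_lines(',', 'only') == 'only'
assert merge_multi_lines(',') == ''
```

Because the function touches no files, each check runs instantly in the shell buffer, which is what makes the edit, evaluate, test loop cheap.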
Performance
Since this script loads all the lines into memory, it may become slow or memory-hungry on large files. If you run into this problem, you can also try this awk snippet, which joins two files and keeps only the first one in memory:
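awk -v OFS=',' '
# First file: remember each line under its line number.
NR==FNR { col[FNR]=$0; next }
# Second file: print the remembered line, then the whole current line.
{ print col[FNR], $0 }
' part1.txt part2.txt > joined.txt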
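If you would rather stay in Python, the same idea can be sketched with lazy file iteration, so that only one line per input file is in memory at a time. The function name `merge_files_streaming`, its argument order, and the UTF-8 encoding are assumptions made for illustration, not part of the original script.

```python
from contextlib import ExitStack


def merge_files_streaming(output_path, separator, *input_paths):
    """Join the n-th line of every input file into the n-th output line.

    Hypothetical helper: reads and writes line by line instead of
    loading whole files.
    """
    with ExitStack() as stack:
        # ExitStack closes every file opened here when the block exits.
        inputs = [stack.enter_context(open(path, encoding='utf-8'))
                  for path in input_paths]
        output = stack.enter_context(
            open(output_path, 'w', encoding='utf-8'))
        # Iterating a file object yields lines lazily, and zip() stops
        # at the shortest file.
        for row in zip(*inputs):
            output.write(separator.join(line.rstrip('\n') for line in row))
            output.write('\n')
```

Unlike the awk snippet, this version accepts any number of input files, at the cost of mixing file handling back into the merging logic.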
Elpy
Next, I want to highlight some Elpy commands that make interactive programming possible along the way. (To keep things simple, I recommend creating a dedicated virtualenv for Elpy and its dependencies.)
Interactive programming support:
M-x elpy-shell-send-defun (C-c C-y f)
M-x elpy-shell-send-statement (C-c C-y e)
M-x elpy-shell-switch-to-shell (C-c C-z)
Code navigation:
M-x elpy-goto-definition (M-.)
M-x xref-pop-marker-stack (M-,)
M-x elpy-occur-definitions (C-c C-o)
Documentation:
M-x elpy-doc (C-c C-d)
P.S. Elpy is great, but the maintainer is too busy to maintain it. If you can help, don't hesitate to get in touch with the maintainer (@galaunay).