Shell – Convert powerpoint to markdown

conversionfile formatmarkdownmicrosoft-powerpointshell

Can I convert powerpoint to markdown?

My ultimate goal is to create an easily versionable version of the powerpoint. (Maybe reveal.js or something)

Best Answer

Is it possible to diff PowerPoint version-controlled with git? has, as today, two suggestions to extract text from powerpoint (but no markdown, sorry):

python-based: python-pptx
node.js-base: textract

Stop-Press - August 2014

Since Pandoc 1.13, Pandoc now contains my implementation of DokuWiki writing - and many more features are implemented there than in this script. So this script is now pretty-much redundant.

Having originally said I didn't want to write a Python script to do the conversion, I ended up doing just that.

The real time-saving step was to use Pandoc to parse the Markdown text, and write out a JSON representation of the document. This JSON file was then mostly fairly easy to parse, and write out in DokuWiki format.

Below is the script, which implements the bits of Markdown and DokuWiki that I cared about - and a few more. (I've not uploaded the corresponding test suite that I wrote)

Requirements to use it:

Python (I was using 2.7 on Windows)
Pandoc installed, and pandoc.exe in your PATH (or edit the script to put in the full path to Pandoc instead)

I hope this saves someone else some time too...

Edit 2: 2013-06-26: I've now put this code into GitHub, at https://github.com/claremacrae/markdown_to_dokuwiki.py. Note that the code there adds support for more formats, and also contains a test suite.

Edit 1: adjusted to add code for parsing code samples in Markdown's backtick style:

# -*- coding: latin-1 -*-

import sys
import os
import json

__doc__ = """This script will read a text file in Markdown format,
and convert it to DokuWiki format.

The basic approach is to run pandoc to convert the markdown to JSON,
and then to parse the JSON output, and convert it to dokuwiki, which
is written to standard output

Requirements:
 - pandoc is in the user's PATH
"""

# TODOs
# underlined, fixed-width
# Code quotes

list_depth = 0
list_depth_increment = 2

def process_list( list_marker, value ):
    global list_depth
    list_depth += list_depth_increment
    result = ""
    for item in value:
        result += '\n' + list_depth * unicode( ' ' ) + list_marker + process_container( item )
    list_depth -= list_depth_increment
    if list_depth == 0:
        result += '\n'
    return result

def process_container( container ):
    if isinstance( container, dict ):
        assert( len(container) == 1 )
        key = container.keys()[ 0 ]
        value = container.values()[ 0 ]
        if key == 'Para':
            return process_container( value ) + '\n\n'
        if key == 'Str':
            return value
        elif key == 'Header':
            level = value[0]
            marker = ( 7 - level ) * unicode( '=' )
            return marker + unicode(' ') + process_container( value[1] ) + unicode(' ') + marker + unicode('\n\n')
        elif key == 'Strong':
            return unicode('**') + process_container( value ) + unicode('**')
        elif key == 'Emph':
            return unicode('//') + process_container( value ) + unicode('//')
        elif key == 'Code':
            return unicode("''") + value[1] + unicode("''")
        elif key == "Link":
            url = value[1][0]
            return unicode('[[') + url + unicode('|') + process_container( value[0] ) + unicode(']]')
        elif key == "BulletList":
            return process_list( unicode( '* ' ), value)
        elif key == "OrderedList":
            return process_list( unicode( '- ' ), value[1])
        elif key == "Plain":
            return process_container( value )
        elif key == "BlockQuote":
            # There is no representation of blockquotes in DokuWiki - we'll just
            # have to spit out the unmodified text
            return '\n' + process_container( value ) + '\n'

        #elif key == 'Code':
        #    return unicode("''") + process_container( value ) + unicode("''")
        else:
            return unicode("unknown map key: ") + key + unicode( " value: " ) + str( value )

    if isinstance( container, list ):
        result = unicode("")
        for value in container:
            result += process_container( value )
        return result

    if isinstance( container, unicode ):
        if container == unicode( "Space" ):
            return unicode( " " )
        elif container == unicode( "HorizontalRule" ):
            return unicode( "----\n\n" )

    return unicode("unknown") + str( container )

def process_pandoc_jason( data ):
    assert( len(data) == 2 )
    result = unicode('')
    for values in data[1]:
        result += process_container( values )
    print result

def convert_file( filename ):
    # Use pandoc to parse the input file, and write it out as json
    tempfile = "temp_script_output.json"
    command = "pandoc --to=json \"%s\" --output=%s" % ( filename, tempfile )
    #print command
    os.system( command )

    input_file = open(tempfile, 'r' )
    input_text = input_file.readline()
    input_file.close()

    ## Parse the data
    data = json.loads( input_text )
    process_pandoc_jason( data )

def main( files ):
    for filename in files:
        convert_file( filename )

if __name__ == "__main__":
    files = sys.argv[1:]

    if len( files ) == 0:
        sys.stderr.write( "Supply one or more filenames to convert on the command line\n" )
        return_code = 1
    else:
        main( files )
        return_code = 0

    sys.exit( return_code )

Org-mode to Markdown converter

Newer versions of org-mode (>= 8.0) can export to markdown directly: http://orgmode.org/manual/Markdown-export.html

But you should note that there is no menu to export to markdown by default, and you need to add the following configuration to your .emacs file such that emacs automatically loads markdown export mode: https://stackoverflow.com/questions/22988092/emacs-org-mode-export-markdown

(eval-after-load "org"
  '(require 'ox-md nil t))

Best Answer

Related Solutions

Markdown – How to Convert Files to Dokuwiki on PC

Stop-Press - August 2014

Org-mode to Markdown converter

Related Question