Working with Documents

python-docx allows you to create new documents as well as make changes to existing ones. Actually, it only lets you make changes to existing documents; it’s just that if you start with a document that doesn’t have any content, it might feel at first like you’re creating one from scratch.

This characteristic is a powerful one. A lot of how a document looks is determined by the parts that are left when you delete all the content. Things like styles and page headers and footers are contained separately from the main content, allowing you to place a good deal of customization in your starting document that then appears in the document you produce.

Let’s walk through the steps to create a document one example at a time, starting with two of the main things you can do with a document: opening it and saving it.

Opening a document

The simplest way to get started is to open a new document without specifying a file to open:

    from docx import Document

    document = Document()
    document.save('test.docx')

This creates a new document from the built-in default template and saves it unchanged to a file named ‘test.docx’. The so-called “default template” is actually just a Word file having no content, stored with the installed python-docx package. It’s roughly the same as you get by picking the Word Document template after selecting Word’s File > New from Template… menu item.

REALLY opening a document

If you want more control over the final document, or if you want to change an existing document, you need to open one with a filename:

    document = Document('existing-document-file.docx')
    document.save('new-file-name.docx')

Opening a ‘file-like’ document

python-docx can open a document from a so-called file-like object. It can also save to a file-like object. This can be handy when you want to get the source or target document over a network connection or from a database and don’t want to (or aren’t allowed to) interact with the file system. In practice this means you can pass an open binary file or an in-memory BytesIO stream object (on Python 2, a StringIO object also works) to open or save a document like so:

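    from io import BytesIO

    from docx import Document

    # pass an open binary file directly
    f = open('foobar.docx', 'rb')
    document = Document(f)
    f.close()

    # or read the bytes into an in-memory stream first

    with open('foobar.docx', 'rb') as f:
        source_stream = BytesIO(f.read())
    document = Document(source_stream)
    source_stream.close()
    ...
    target_stream = BytesIO()
    document.save(target_stream)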

Always open the file in 'rb' mode. The mode defaults to 'r' (text mode), which was enough on some operating systems under Python 2, but the ‘b’ (selecting binary mode) is required on Windows and at least some versions of Linux to allow zipfile to open the file. On Python 3, text mode decodes the file’s bytes into text, so binary mode is needed on every operating system.

Okay, so you’ve got a document open and are pretty sure you can save it somewhere later. Next step is to get some content in there …
