Редактирование надписи в powerpoint python через win32com client
Как Python управляет офисом для автоматизации? --- приложение win32com.client
Фон приложения
В работе, из-за необходимости регулярно отчитываться, нужен офис. В основном используется форма Excel, а затем отчеты по электронной почте отправляются каждой команде или руководителю. Здесь много повторяющейся работы. Часто существует фиксированный шаблон для отчета о работе, и каждый раз вам нужно только вручную импортировать соответствующие данные. Если эти повторяющиеся действия автоматизированы, вы, несомненно, можете сохранить Сделай много усилий. Поэтому я подумал об использовании Python для достижения автоматической генерации форм. Сегодняшнее введение - это только часть, в основном автоматическое изменение и генерация форм Excel.
Кажется, это очень простая задача. Вы можете вызывать часто используемые модули Python, связанные с Excel, xlrd, xlwd или openpyxl. Да, эти простые операции над таблицами Excel - 666, но в моей таблице есть сводная таблица. Это сделало меня очень болезненным, и внезапно обнаружил, что три вышеупомянутых модуля были не просты в использовании, и они использовались. Я также обнаружил, что openpyxl, похоже, не имеет функции для непосредственного удаления строк. Когда вы копируете предыдущую таблицу в качестве базовой таблицы, вы меняете некоторые После того, как содержимое было сохранено, сводная таблица исчезла, и мое сердце было очень разбито. Я видел много постов в Google и Baidu, и не было никакого хорошего пути. Наконец, я увидел пост в stackoverflow и использовал win32com.client. Сначала я этого не понимал, многие функции внутри не знали, откуда он взялся, и документации не было.
Наконец, я обнаружил, что win32com.client может напрямую вызывать библиотеку VBA, которая является очень мощной. VBA включает в себя функцию записи макросов. Вручную напрямую работает Excel для записи, вы можете найти соответствующую функцию и затем вызвать ее. Функции реализованы.
примеров
Сначала посмотрите на таблицу случаев:
Здесь представлена только часть реализации функции, которая также является основной частью. Остальные операции с листовой страницей аналогичны. Здесь мы представим методы, используемые некоторыми модулями.
Первая таблица, которая в основном должна загружать требуемые Случаи из внутренней общей папки, является файлом типа документа, и их нужно записать в столбец AllCases в Таблице 1. Здесь относительно просто, нам нужно использовать только open и readlines (), это может быть записано в таблицу Excel путем обхода. Затем в таблице 2 обновите сводную таблицу. Ниже, я перехватываю часть кода, путь может сам вводить ввод, я буду представлять его в блоках.
Перебор содержимого файла
Пройдите по файлу и запишите случаи. Код выглядит следующим образом. Используйте модуль win32com.client (self.excel), чтобы открыть файл формы (self.filepath), который нам нужно изменить. Используйте wb.Worksheets («AllCases»), чтобы открыть страницу листа «AllCases». Примечание W of Worksheets - это прописные буквы. Не забудьте добавить s, прочитать каждую строку с помощью f.readlines () и записать каждую строку в таблицу с помощью for in. Здесь Range ('A1') представляет ячейку A1. Плюс. Значение это его стоимость.
Мощный win32com.client
Функциональный модуль VBA можно вызвать следующим образом: если вы хотите использовать слово, измените его на «Word.Applicaiton». Первый - включить Excel, а второй - вызвать некоторые специфичные для VBA переменные, например, вызвать свойство и добавить его непосредственно перед ним.
Используя это, мы можем вызывать некоторые функциональные модули VBA, например функцию удаления строк.В openpyxl я не нашел функцию непосредственного удаления строк, а win32com может удалять строки следующими способами:
Здесь следует отметить, что Delete или многие функции в VBA вызываются без скобок, для их использования нам нужно добавить скобки в Python.
Ниже приведен формат для заполнения предыдущей строки, которая является нашей обычной выпадающей копией:
Функция сводной таблицы:
Эти функции, вам не нужно идти в Интернет, чтобы найти их, вы можете непосредственно открыть форму Excel, есть макрос под видом, используя макрос записи, а затем вручную обновить сводную таблицу, остановить запись, просмотреть макрос вы можете увидеть код для обновления сводной таблицы, вы можете Скопируйте его напрямую, измените соответствующие параметры и не забудьте добавить скобки.В приведенном выше примере PivotCache () нельзя использовать без скобок.
Другие функции могут быть вызваны в соответствии с вашими потребностями.
Обратите внимание
Использование win32com.client позволяет более плавно обрабатывать различные сложные ситуации Excel или Word. По сравнению с openpyxl, xlrd, xlwd это будет более полным. Тем не менее, есть некоторые моменты, которые следует отметить во время использования:
As I've made slight progress with the problem at hand since i posted this question, I've found it necessary to split it in two parts to maintain clarity.
- How can I manipulate shape colors in PowerPoint using Python and win32com.client?
- How can I inspect com objects in Python using dir() ?
There are some examples on how to edit PowerPoint slides using the pptx library here. However, I find it much easier to manipulate an active PowerPoint presentation using win32com.client as described here. Using an example from Microsoft Developer Network I've found that I can easily replicate parts of the functionality of this VBA snippet.
. with this Python snippet:
Here I'm able to manipulate font size and name. I can also change the orientation of the textbox by changing Orientation=0x1 to Orientation=0x5 in shape1 = slide.Shapes.AddTextbox(Orientation=0x1,Left=100,Top=100,Width=100,Height=100) .
What seems impossible though, is editing the box or font color.
This does not work:
But I'm having trouble here as well with pip install colour :
By now I'm a bit lost on all accounts, so any hints to ANY way of manipulating colors would be great!
In my attempts to manage those pesky colors, I started inspecting the output from dir(shape1.TextFrame) , dir(shape1.TextFrame.Textrange) and so on. To my disappointment, I could not find anything about colors, and not even Font, although Font is clearly accessible for manipulation.
So my second question is this: Is this not the way to inspect and manipulate these shapes at all? And how could I find the right object (or method?) to manipulate shape1 further? I have had a look at the PowerPoint objectmodel, but with little success.
This post explains how to use Python and COM to automate PowerPoint. COM provides a much deeper level of control than any of the Python modules that exist at this time. It exposes the entire VBA API to Python, allowing us to use Python’s superior pretty much everything (my opinion) to take advantage of VBA’s deep integration with MS Office.
This process can be used with any MS Office software that supports VBA. So, you could automate Word, Excel, Access, … etc. in the same ways.
Before working with COM, we need to find a way to interface over COM. To do this we can use the Python package pywin32 to provide us with useful functions and an easy way to talk to a COM32 program. In this guide I am using Python 3.
In order to control an application, we first need to create a connection to the application itself. This command opens a COM32 connection to PowerPoint that we can send commands over.
The ‘app’ object is now our entry point to the PowerPoint Object controls. We can now create an object that contains an open .pptx by using a bit of VBA-esque code.
We now have an object, ‘ppt’, that contains another object, ‘objCOM’, that is a direct handle on the ‘presentation’ object level in VBA. We can now use objCOM as a direct replacement for a VBA object such as ‘activePresentation’. So,
self.objCOM.SlideShowWindow.View.Next() (python)
if used within the ppt class or
inst_name.objCOM.SlideShowWindow.View.Next() (python)
if used by an external method.
Using the ideas from above, we can set, read, or use almost any method or variable made available to us through the VBA object library. We can utilize the Microsoft Office Object Reference Library to discover usable properties and functions. This API is available for all MS Office apps. Here are a few examples of available objects using the PowerPoint object from earlier.
Creating a New SlideShow (creating an object)
self.objCom.Presentations.Add
Advancing a Slide (controlling an object)
self.objCOM.SlideShowWindow.View.Next()
This advances the SlideShow to the next unhidden slide.
Creating a Slide (creating a sub-object)
self.objCOM.Slides.Add Index:=ActivePresentation.Slides.Count + 1
The Index is where in the slide show you want to add the slide.
Setting The SlideShow to Advance Manually Only (setting a property)
self.objCOM.SlideShowSettings.AdvanceMode = ppAdvanceOnClick
'ppAdvanceOnClick' is only available if you import the Microsoft PowerPoint 12.0 Object Library constants in your code. Otherwise, you need to convert these into their hex value equivalents.
I suggest using from constants_file import * to import them as it doesn't use the imported file namespace and is more VBA-esque.
Reading Current Slide Index (reading a property)
self.objCOM.SlideShowWindow.View.Slide.SlideIndex
The SlideIndex is a unique slide identifier that can be used to refer to any slide in the SlideShow. Each slide has one (including hidden slides) and they are incremented in order of the slides in the SlideShow. eg. 1, 2, 3, .
Closing the application is optional as the process will be closed automatically when all references to it have been deleted. However, it is good practice to close the reference at the end of your program or when you are finished with all PowerPoint objects. You can request that the object close from the application level by using app.Quit() .
Writing a program to draw or change slides is sometimes easier than doing it manually. To change all fonts on a presentation to Arial, for example, you’d write this Visual Basic macro:
If you didn’t like Visual Basic, though, you could write the same thing in Python:
Save this as arial.py and type “arial.py some.ppt” to convert some.ppt into Arial.
Let’s break that down a bit. import win32com.client lets you interact with Windows using COM. You need ActivePython to do this. Now you can launch PowerPoint with
The Application object you get here is the same Application object you’d use in Visual Basic. That’s pretty powerful. What that means is, to a good extent, you can copy and paste Visual Basic code into Python and expect it to work with minor tweaks for language syntax, just please make sure to learn how to update python before doing anything else.
So let’s try to do something with this. First, let’s open PowerPoint and add a blank slide.
That 12 is the code for a blank slide. In Visual Basic, you’d instead say:
To do this in Python, run Python/Lib/site-packages/win32com/client/makepy.py and pick “Microsoft Office 12.0 Object Library” and “Microsoft PowerPoint 12.0 Object Library”. (If you have a version of Office other than 12.0, pick your version.)
This creates two Python files. I rename these files as MSO.py and MSPPT.py and do this:
This makes constants like ppLayoutBlank , msoShapeRectangle , etc. available. So now I can create a blank slide and add a rectangle Python just like in Visual Basic:
Incidentally, the dimensions are in points (1/72″). Since the default presentation is 10″ x 7.5″ the size of each page is 720 x 540.
Let’s do something that you’d have trouble doing manually in PowerPoint: a Treemap. The Guardian’s data store kindly makes available the top 50 banks by assets that we’ll use for this example. Our target output is a simple Treemap visualisation.
We’ll start by creating a blank slide. The code is as before.
I created a simple Treemap class based on the squarified algorithm — you can play with the source code. This Treemap class can be fed the data in the format we have, and a draw function. The draw function takes (x, y, width, height, data_item) as parameters, where data_item is a row in the data list that we pass to it.
Try running the source code. You should have a single slide in PowerPoint like this.
The beauty of using PowerPoint as the output format is that converting this into a cushioned Treemap with gradients like below (or changing colours, for that matter), is a simple interactive process.
Step by step tutorial to edit PowerPoint slides using Python
After the two articles about using Microsoft Excel smarter, I have received a few direct messages saying that they are interested in this area and hope I could share more on other office products. It stimulates me to make a collection series of articles related to office tips. And the first one is Microsoft PowerPoint. In this article, you would learn about
- how to determine the shapes in the PowerPoint slides.
- modifying the slide such as inserting images or changing words.
- the methods to output the slides in different formats such as PNG or PDF.
Also, if you have not yet checked my articles about Microsoft Excel, feel free to check the links below.
Use Python to Stylize the Excel Formatting
Step by step tutorial to format the Excel spreadsheet using Python
Use Excel to Scrape Data (NO CODES REQUIRED)
Three simple steps to scrape data with Excel built-in function
In many people daily work, they have to frequently and regularly update the figures in the PowerPoint slides such as the table figures, date, KPI statistics, etc. It would be quite annoying to spend 15–30 minutes every day just to handle these kinds of tedious works.
In the following example, we are assuming to have a regular update on the currency exchange prices to pop up the currency exchange rates that are having the largest changes and we are required to show the trending of the prices for the one with the largest percentage changes.
All codes and materials are uploaded to my GitHub. You can check and folk this repo to further study. =)
First things first, we have to decide the elements that we are going to update. Take my slide as an example, there are in total 7 places needed to update.
- Last update date time
- Top 5 table
- Bottom 5 table
- Top figure label
- Bottom figure label
- Top figure
- Bottom figure
After understanding the components to update, what’s next would be the data source to be used for updating. In general office work, the data can be extracted from the SQL server or received the data file from the email, or etc. Here, we would demonstrate the case that the data are scraped from the Internet. We will use the Yahoo Finance data as an illustration.
I am not going too deep about how data are scraped since this is not the main focus of this article. Basically, it’s just two lines of code. First, we get the page using requests . Then, we extract the table using pandas . Please note that we have also recorded the scraping date-time for later usage.
The table looks good and consists of everything we needed. We can now go to the next move which is to sort the tables and get the top 5 and bottom 5 currencies exchange rates.
The figures are all ready. The only things left would be the two plots. To generate the plot, you only need to follow the below codes. Don’t worry! I will explain step-by-step and it would be pretty easy to follow.
- Get the data for the currencies exchange rate.
- Extract the closing pricing and plot in a line chart.
- Format the plots such as colouring, font size or transparent background so as to align with the PPT theme.
- Save the plots into PNG.
1. Get the data
The price series of data for a particular currency exchange rate can be found in the following link. There is only one parameter which is the name of that currency exchange rate.
Please note that this time you need to specify a header so as to successfully get the page data. The header is used to pretend you are visiting the page through a browser.
The rest should be easy to understand which is the common procedure to get the page data and load as JSON format.
2. Plot the line chart
To get the price list, we just have to check the dictionary structure of the data and you can get it. Just one thing to remind is that I have a checking here to remove those None data in the price list since I found there are some missing data in the list.
We use matplotlib to plot the line.
3. Stylize the chart
There are a few things I have made to polish the plots.
- Remove the x-axis ticks for the Date
- Change the font size and font color for the y-axis ticks
- Change the border color and make the line width larger
4. Save the plot
Finally, we save the figure into PNG format. Please note that to make the background transparent, you only have to specify transparent=True .
It comes to our focus today. Before using Python to edit PowerPoint, you need to have the python-pptx package. To install it, you can type the following code in the terminal.
Just follow our usual practice, I show you all the codes first and then I walk you through them step-by-step.
The code seems so long and complicated. Don’t panic. If you understand the structure, you can handle it easily.
- Specify the slide you are working on.
- Remove the existing plots.
- Add the new plots.
- Define the shape index for the components.
- Update the components one by one.
- Export the PPT into whatever format we want.
1. Load the PPT and specify the slide
From the above code, you can see the number 0 in the bracket. It refers to the slide number. Since I am working on the first slide, I specify the number to be 0 . For example, if your regular update slide is in slide 10, then you have to specify the number as 9 .
2. Remove the old plots
In Step 1, we have defined our 7 components and there are two plots — component 6 and component 7. To replace them, we first have to delete them. Otherwise, the new plots will be overlapped with the old plots. But the problem is ‘how do we specify the particular items in the slide?’
In python-pptx , we have different shapes for different objects. To check the shapes you now have in the slide, you can do the following code.
Take my slide as an example, the below is the output I received.
Then we can figure out the old plots should be PICTURE (13) . To remove them, we just need to check the shape type whether equals 13 or not.
3. Add the new plots
The charts can be easily added with the following function.
add_picture(image_file, left, top, width=None, height=None)
Basically, the thing you have to do is to set the image file path and the x-y location of the image you would like to put just like below.
Please note that one more step to go for the figures is to specify the placing order of the images. By doing so, you have to find a relative shape as the reference shape and use the addnext and addprevious functions to specify their relationships. This is just like the concept of Bring Forward and Send Backward in PowerPoint.
4. Find the shape index
To edit the shape, we have to identify which shape index it refers to. In shorts, we would firstly categorize the shapes into different lists and identify the corresponding shape index based on the x-coordinate (left) and y-coordinate (height) .
Text Box List
There are three shapes for text boxes. To identify the last update date time shape, we can find the one with the largest height.
Auto Shape List
There are only two auto shapes which would be the top label and bottom label. They are having the same height but different x-coordinates. The one with the smallest value in left would be the top label and the other one would be the bottom label.
Table List
Since there are also only two tables, we can simply apply a similar trick just like what we did in the auto shape list.
5. Update the components
To update the components, one extremely important rule is to follow changing value inside the shape but not the formatting. In below, I use the last update date time as an example.
Plain Text (Don’t use this one)
Text with Formatting (Use this one)
From the above image, you can tell difference. A paragraph is actually used to store not only the value of the shape but also the formatting information such as the alignment, font, hyperlink and etc. Therefore, remember to change the text value in the paragraph runs but not the shape one.
The labels of the figures are actually having the same structure and I am going to repeat the concept again. Let’s move to the update for the table value.
Basically, it’s pretty straightforward. We find our interested table and change the value inside the paragraph runs for each cell in the table. Please note that i refers to the row record and j refers to the column.
6. Export the files
The tutorial is almost done and the last step would be to transform the file into the format we want. Basically, let me share the most common file structures people would use.
Save as PPT
Save as PNG
Save as PDF
That’s the end of the office tips for using Python to automate the work in PowerPoint. If you are interested to know more about this kind of work tips, do give a like and follow. Stay tuned for my next tutorial. =)
If you find my article useful, please endorse my skills on my LinkedIn page to encourage me to write more articles.
Читайте также: