Parsing files (ics / icalendar) using Python

Question

I have a .ics file in the following format. What is the best way to parse it? I need to retrieve the summary, description, and time for each of the entries.

    BEGIN:VCALENDAR
    X-Lotus-CHARSET:UTF-8
    VERSION:2.0
    PRODID:-//Lotus Development Corporation//NONSGML Notes 8.0//EN
    METHOD:PUBLISH
    BEGIN:VTIMEZONE
    TZID:India
    BEGIN:STANDARD
    DTSTART:19500101T020000
    TZOFFSETFROM:+0530
    TZOFFSETTO:+0530
    END:STANDARD
    END:VTIMEZONE
    BEGIN:VEVENT
    DTSTART;TZID="India":20100615T111500
    DTEND;TZID="India":20100615T121500
    TRANSP:OPAQUE
    DTSTAMP:20100713T071035Z
    CLASS:PUBLIC
    DESCRIPTION:Emails\nDarlene\n Murphy\nDr. Ferri\n
    UID:12D3901F0AD9E83E65257743001F2C9A-Lotus_Notes_Generated
    X-Lotus-UPDATE-SEQ:1
    X-Lotus-UPDATE-WISL:$S:1;$L:1;$B:1;$R:1;$E:1;$W:1;$O:1;$M:1
    X-Lotus-NOTESVERSION:2
    X-Lotus-APPTTYPE:0
    X-Lotus-CHILD_UID:12D3901F0AD9E83E65257743001F2C9A
    END:VEVENT
    BEGIN:VEVENT
    DTSTART;TZID="India":20100628T130000
    DTEND;TZID="India":20100628T133000
    TRANSP:OPAQUE
    DTSTAMP:20100628T055408Z
    CLASS:PUBLIC
    DESCRIPTION:
    SUMMARY:smart energy management
    LOCATION:8778/92050462
    UID:07F96A3F1C9547366525775000203D96-Lotus_Notes_Generated
    X-Lotus-UPDATE-SEQ:1
    X-Lotus-UPDATE-WISL:$S:1;$L:1;$B:1;$R:1;$E:1;$W:1;$O:1;$M:1
    X-Lotus-NOTESVERSION:2
    X-Lotus-NOTICETYPE:A
    X-Lotus-APPTTYPE:3
    X-Lotus-CHILD_UID:07F96A3F1C9547366525775000203D96
    END:VEVENT
    BEGIN:VEVENT
    DTSTART;TZID="India":20100629T110000
    DTEND;TZID="India":20100629T120000
    TRANSP:OPAQUE
    DTSTAMP:20100713T071037Z
    CLASS:PUBLIC
    SUMMARY:meeting
    UID:6011DDDD659E49D765257751001D2B4B-Lotus_Notes_Generated
    X-Lotus-UPDATE-SEQ:1
    X-Lotus-UPDATE-WISL:$S:1;$L:1;$B:1;$R:1;$E:1;$W:1;$O:1;$M:1
    X-Lotus-NOTESVERSION:2
    X-Lotus-APPTTYPE:0
    X-Lotus-CHILD_UID:6011DDDD659E49D765257751001D2B4B
    END:VEVENT

Wok · Answer

The icalendar package looks nice.

For example, to write a file:

    from icalendar import Calendar, Event
    from datetime import datetime
    from pytz import UTC  # timezone

    cal = Calendar()
    cal.add('prodid', '-//My calendar product//mxm.dk//')
    cal.add('version', '2.0')

    event = Event()
    event.add('summary', 'Python meeting about calendaring')
    event.add('dtstart', datetime(2005,4,4,8,0,0,tzinfo=UTC))
    event.add('dtend', datetime(2005,4,4,10,0,0,tzinfo=UTC))
    event.add('dtstamp', datetime(2005,4,4,0,10,0,tzinfo=UTC))
    event['uid'] = '20050115T101010/27346262376@mxm.dk'
    event.add('priority', 5)

    cal.add_component(event)

    f = open('example.ics', 'wb')
    f.write(cal.to_ical())
    f.close()

Tadaaa, you get this file:

    BEGIN:VCALENDAR
    PRODID:-//My calendar product//mxm.dk//
    VERSION:2.0
    BEGIN:VEVENT
    DTEND;VALUE=DATE:20050404T100000Z
    DTSTAMP;VALUE=DATE:20050404T001000Z
    DTSTART;VALUE=DATE:20050404T080000Z
    PRIORITY:5
    SUMMARY:Python meeting about calendaring
    UID:20050115T101010/27346262376@mxm.dk
    END:VEVENT
    END:VCALENDAR

But what is in this file?

    g = open('example.ics','rb')
    gcal = Calendar.from_ical(g.read())
    for component in gcal.walk():
        print(component.name)
    g.close()

You can see it easily:

    >>>
    VCALENDAR
    VEVENT
    >>>

What about parsing the event data:

    g = open('example.ics','rb')
    gcal = Calendar.from_ical(g.read())
    for component in gcal.walk():
        if component.name == "VEVENT":
            print(component.get('summary'))
            print(component.get('dtstart'))
            print(component.get('dtend'))
            print(component.get('dtstamp'))
    g.close()

Now you get:

    >>>
    Python meeting about calendaring
    20050404T080000Z
    20050404T100000Z
    20050404T001000Z
    >>>

Brad Montgomery · Answer

You could probably also use the vobject module for this: http://pypi.python.org/pypi/vobject

If you have a sample.ics file, you can read its contents like so:

    import vobject

    # read the data from the file
    data = open("sample.ics").read()

    # parse the top-level event with vobject
    cal = vobject.readOne(data)

    # Get Summary
    print('Summary: ', cal.vevent.summary.valueRepr())
    # Get Description
    print('Description: ', cal.vevent.description.valueRepr())

    # Get Time
    print('Time (as a datetime object): ', cal.vevent.dtstart.value)
    print('Time (as a string): ', cal.vevent.dtstart.valueRepr())

Wayne Werner · Answer

Four years later, and understanding ICS a bit better: if those were the only fields I needed, I'd just use the native string methods:

    import io

    # Probably not a valid .ics file, but we don't really care for the example
    # it works fine regardless
    file = io.StringIO('''
    BEGIN:VCALENDAR
    X-Lotus-CHARSET:UTF-8
    VERSION:2.0
    DESCRIPTION:Emails\nDarlene\n Murphy\nDr. Ferri\n
    SUMMARY:smart energy management
    LOCATION:8778/92050462
    DTSTART;TZID="India":20100629T110000
    DTEND;TZID="India":20100629T120000
    TRANSP:OPAQUE
    DTSTAMP:20100713T071037Z
    CLASS:PUBLIC
    SUMMARY:meeting
    UID:6011DDDD659E49D765257751001D2B4B-Lotus_Notes_Generated
    X-Lotus-UPDATE-SEQ:1
    X-Lotus-UPDATE-WISL:$S:1;$L:1;$B:1;$R:1;$E:1;$W:1;$O:1;$M:1
    X-Lotus-NOTESVERSION:2
    X-Lotus-APPTTYPE:0
    X-Lotus-CHILD_UID:6011DDDD659E49D765257751001D2B4B
    END:VEVENT
    '''.strip())

    parsing = False
    for line in file:
        field, _, data = line.partition(':')
        if field in ('SUMMARY', 'DESCRIPTION', 'DTSTAMP'):
            parsing = True
            print(field)
            print('\t'+'\n\t'.join(data.split('\n')))
        elif parsing and not data:
            # a line without ':' continues the previous field
            print('\t'+'\n\t'.join(field.split('\n')))
        else:
            parsing = False

Storing the data and parsing the date/time is left as an exercise for the reader (it's always UTC).
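One way to approach that exercise, as a minimal sketch using only the standard library: unfold continuation lines (RFC 5545 folds long lines by starting the continuation with a space or tab), collect each VEVENT's properties into a dict, and convert a `Z`-suffixed value such as DTSTAMP with `datetime.strptime`. The helper names `unfold`, `parse_events`, and `parse_utc` and the sample text are made up for illustration. Property parameters such as `TZID` are simply dropped and escaped text is left undecoded, so this is not a full parser.

```python
from datetime import datetime, timezone


def unfold(text):
    """Join folded lines: a line starting with a space or tab continues the previous one."""
    lines = []
    for raw in text.splitlines():
        if raw[:1] in (" ", "\t") and lines:
            lines[-1] += raw[1:]
        elif raw:
            lines.append(raw)
    return lines


def parse_events(text):
    """Return one dict per VEVENT, keyed by property name (parameters dropped)."""
    events, current = [], None
    for line in unfold(text):
        name, _, value = line.partition(":")
        name = name.split(";", 1)[0].upper()
        if name == "BEGIN" and value == "VEVENT":
            current = {}
        elif name == "END" and value == "VEVENT":
            if current is not None:
                events.append(current)
            current = None
        elif current is not None:
            current[name] = value
    return events


def parse_utc(stamp):
    """Convert a UTC value such as '20100713T071037Z' into an aware datetime."""
    return datetime.strptime(stamp, "%Y%m%dT%H%M%SZ").replace(tzinfo=timezone.utc)


# Hypothetical sample with one folded SUMMARY line.
sample = (
    "BEGIN:VCALENDAR\n"
    "BEGIN:VEVENT\n"
    'DTSTART;TZID="India":20100629T110000\n'
    "DTSTAMP:20100713T071037Z\n"
    "SUMMARY:smart energy\n"
    "  management\n"
    "END:VEVENT\n"
    "END:VCALENDAR\n"
)

events = parse_events(sample)
for event in events:
    print(event["SUMMARY"], parse_utc(event["DTSTAMP"]).isoformat())
```

Note that `parse_utc` only fits values that end in `Z`. A DTSTART carrying a `TZID` parameter, as in the question's file, is local time and would need the timezone applied separately.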

Old answer below

You could use a regular expression:

    import re

    text = "..."  # your text
    print(re.search("SUMMARY:.*?:", text, re.DOTALL).group())
    print(re.search("DESCRIPTION:.*?:", text, re.DOTALL).group())
    print(re.search("DTSTAMP:.*:?", text, re.DOTALL).group())

I'm sure it's possible to skip the first and last words; I just don't know how to do it with a regex. You could do it this way, though:

    print(' '.join(re.search("SUMMARY:.*?:", text, re.DOTALL).group().replace(':', ' ').split()[1:-1]))
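The first-and-last-word trimming can also be done inside the pattern itself with a capture group. The sketch below is one possible approach, not the answer's original method: with `re.MULTILINE`, `^` and `$` match at line boundaries, so `group(1)` returns only the value after the property name. The `field_value` helper and the sample text are made up for illustration, and folded (multi-line) values are not handled.

```python
import re

# Illustrative sample; any unfolded .ics text would do.
text = (
    "BEGIN:VEVENT\n"
    "DTSTAMP:20100713T071037Z\n"
    "DESCRIPTION:Weekly sync\n"
    "SUMMARY:smart energy management\n"
    "END:VEVENT\n"
)


def field_value(name, text):
    """Return the value of the first `name:` line, or None if it is absent."""
    match = re.search(rf"^{re.escape(name)}:(.*)$", text, re.MULTILINE)
    return match.group(1).strip() if match else None


print(field_value("SUMMARY", text))
print(field_value("DTSTAMP", text))
print(field_value("LOCATION", text))
```

Returning `None` for a missing field avoids the `AttributeError` that calling `.group()` on a failed `re.search` would raise.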

cibsbui · Answer

I'm new to Python; the comments above were very helpful, so I wanted to post a more complete sample.

    # ics to csv example
    # dependency: https://pypi.org/project/vobject/

    import vobject
    import csv

    # newline='' lets the csv module control line endings itself
    with open('sample.csv', mode='w', newline='') as csv_out:
        csv_writer = csv.writer(csv_out, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
        csv_writer.writerow(['WHAT', 'WHO', 'FROM', 'TO', 'DESCRIPTION'])

        # read the data from the file
        with open("sample.ics") as ics_in:
            data = ics_in.read()

        # iterate through the contents
        for cal in vobject.readComponents(data):
            for component in cal.components():
                if component.name == "VEVENT":
                    # write to csv; this assumes every event carries all five properties
                    csv_writer.writerow([
                        component.summary.valueRepr(),
                        component.attendee.valueRepr(),
                        component.dtstart.valueRepr(),
                        component.dtend.valueRepr(),
                        component.description.valueRepr(),
                    ])