Converting CSV to an HTML table in Python

Question

I'm trying to take data from a .csv file and load it into an HTML table from within Python.

This is the csv file: https://www.mediafire.com/?mootyaa33bmijiq

Context:
The csv contains data for a football team [Age group, Round, Opposition, Team score, Opposition score, Location]. I need to be able to select a specific age group and display only those details in separate tables.

This is all I have so far...

    infile = open("Crushers.csv","r")

    for line in infile:
        row = line.split(",")
        age = row[0]
        week = row[1]
        opp = row[2]
        ACscr = row[3]
        OPPscr = row[4]
        location = row[5]

        if age == 'U12':
            print(week, opp, ACscr, OPPscr, location)

John Gordon · Accepted Answer

Before you start printing the desired rows, output some HTML to set up an appropriate table structure.

When you find a row you want to print, output it in HTML table row format.

    # begin the table
    print("<table>")

    # column headers: a <tr> row containing <th> cells
    print("<tr>")
    print("<th>Week</th>")
    print("<th>Opp</th>")
    print("<th>ACscr</th>")
    print("<th>OPPscr</th>")
    print("<th>Location</th>")
    print("</tr>")

    infile = open("Crushers.csv", "r")

    for line in infile:
        # strip() removes the trailing newline so it does not end up in the last cell
        row = line.strip().split(",")
        age = row[0]
        week = row[1]
        opp = row[2]
        ACscr = row[3]
        OPPscr = row[4]
        location = row[5]

        if age == 'U12':
            print("<tr>")
            print("<td>%s</td>" % week)
            print("<td>%s</td>" % opp)
            print("<td>%s</td>" % ACscr)
            print("<td>%s</td>" % OPPscr)
            print("<td>%s</td>" % location)
            print("</tr>")

    infile.close()

    # end the table
    print("</table>")
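The question also asks for a separate table per age group. Here is a minimal sketch of one way to get there with the same row-by-row approach, collecting rows per age group before building the HTML. The function name `tables_by_age` and the in-memory sample rows are assumptions made for illustration; with the real file you would pass an open file object instead.

```python
import csv
import io
from html import escape

HEADERS = ["Week", "Opp", "ACscr", "OPPscr", "Location"]

def tables_by_age(lines):
    """Return {age group: HTML table string} for an iterable of CSV lines."""
    rows_by_age = {}
    for row in csv.reader(lines):
        if len(row) < 6:
            continue  # skip blank or malformed lines
        age, *cells = row[:6]
        rows_by_age.setdefault(age, []).append(cells)

    tables = {}
    for age, rows in rows_by_age.items():
        out = ["<table>"]
        out.append("<tr>" + "".join("<th>%s</th>" % h for h in HEADERS) + "</tr>")
        for cells in rows:
            out.append("<tr>" + "".join("<td>%s</td>" % escape(c) for c in cells) + "</tr>")
        out.append("</table>")
        tables[age] = "\n".join(out)
    return tables

# Sample rows standing in for Crushers.csv (assumed column order: age first)
sample = io.StringIO(
    "U12,1,Waterford,0,4,Home\n"
    "U13,1,Waterford,22,36,Home\n"
    "U12,2,North Lakes,12,18,Away\n"
)
tables = tables_by_age(sample)
for age, table in tables.items():
    print("<h2>%s</h2>" % escape(age))
    print(table)
```

For the real data, something like `with open("Crushers.csv", newline="") as f: tables = tables_by_age(f)` should work, assuming every line has the six columns described in the question.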

Nabil Bennani · Answer

First, install pandas:

pip install pandas

Then run:

    import pandas as pd

    columns = ['age', 'week', 'opp', 'ACscr', 'OPPscr', 'location']
    df = pd.read_csv('Crushers.csv', names=columns)

    # Change this filter to whatever you want to select
    age_15 = df[df['age'] == 'U15']

    # Other examples:
    bye = df[df['opp'] == 'Bye']
    # Note: if pandas parses the score columns as integers,
    # compare against 0 rather than '0'
    crushed_team = df[df['ACscr'] == '0']
    crushed_visitor = df[df['OPPscr'] == '0']
    # Play with this

    # Use .to_html() to get your table as HTML
    print(crushed_visitor.to_html())

You'll get something like:

    <table border="1" class="dataframe">
      <thead>
        <tr style="text-align: right;">
          <th></th>
          <th>age</th>
          <th>week</th>
          <th>opp</th>
          <th>ACscr</th>
          <th>OPPscr</th>
          <th>location</th>
        </tr>
      </thead>
      <tbody>
        <tr>
          <th>34</th>
          <td>U17</td>
          <td>1</td>
          <td>Banyo</td>
          <td>52</td>
          <td>0</td>
          <td>Home</td>
        </tr>
        <tr>
          <th>40</th>
          <td>U17</td>
          <td>7</td>
          <td>Aspley</td>
          <td>62</td>
          <td>0</td>
          <td>Home</td>
        </tr>
        <tr>
          <th>91</th>
          <td>U12</td>
          <td>7</td>
          <td>Rochedale</td>
          <td>8</td>
          <td>0</td>
          <td>Home</td>
        </tr>
      </tbody>
    </table>
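To get one table per age group rather than a single filtered table, `groupby` can be combined with `to_html`. This is a rough sketch, assuming pandas is installed; the in-memory sample rows stand in for Crushers.csv, and `index=False` is used to drop the row-number column seen in the output above.

```python
import io
import pandas as pd

columns = ['age', 'week', 'opp', 'ACscr', 'OPPscr', 'location']

# Sample rows standing in for Crushers.csv (illustrative only)
sample = io.StringIO(
    "U17,1,Banyo,52,0,Home\n"
    "U12,7,Rochedale,8,0,Home\n"
    "U17,7,Aspley,62,0,Home\n"
)
df = pd.read_csv(sample, names=columns)

# One HTML table per age group, in order of first appearance;
# index=False leaves out the DataFrame index column
tables = {
    age: group.to_html(index=False)
    for age, group in df.groupby('age', sort=False)
}

for age, table in tables.items():
    print(f"<h2>{age}</h2>")
    print(table)
```

Each value in `tables` is a complete `<table>` string, so the tables can be written to separate files or concatenated into a single page.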

Messa · Answer

First some imports:

    import csv
    from html import escape
    import io

Now the building blocks: let's write one function to read the CSV and another to build the HTML table:

    def read_csv(path, column_names):
        with open(path, newline='') as f:
            # why newline='': see footnote at the end of https://docs.python.org/3/library/csv.html
            reader = csv.reader(f)
            for row in reader:
                record = {name: value for name, value in zip(column_names, row)}
                yield record

    def html_table(records):
        # records is expected to be a list of dicts
        column_names = []
        # first detect all possible keys (field names) that are present in records
        for record in records:
            for name in record.keys():
                if name not in column_names:
                    column_names.append(name)
        # create the HTML line by line
        lines = []
        lines.append('<table>\n')
        lines.append('  <tr>\n')
        for name in column_names:
            lines.append('    <th>{}</th>\n'.format(escape(name)))
        lines.append('  </tr>\n')
        for record in records:
            lines.append('  <tr>\n')
            for name in column_names:
                value = record.get(name, '')
                lines.append('    <td>{}</td>\n'.format(escape(value)))
            lines.append('  </tr>\n')
        lines.append('</table>')
        # join the lines to a single string and return it
        return ''.join(lines)

Now put it all together :)

    records = list(read_csv('Crushers.csv', 'age week opp ACscr OPPscr location'.split()))

    # Print first record to see whether we are loading correctly
    print(records[0])
    # Output:
    # {'age': 'U13', 'week': '1', 'opp': 'Waterford', 'ACscr': '22', 'OPPscr': '36', 'location': 'Home'}

    records = [r for r in records if r['age'] == 'U12']

    print(html_table(records))
    # Output:
    # <table>
    #   <tr>
    #     <th>age</th>
    #     <th>week</th>
    #     <th>opp</th>
    #     <th>ACscr</th>
    #     <th>OPPscr</th>
    #     <th>location</th>
    #   </tr>
    #   <tr>
    #     <td>U12</td>
    #     <td>1</td>
    #     <td>Waterford</td>
    #     <td>0</td>
    #     <td>4</td>
    #     <td>Home</td>
    #   </tr>
    #   <tr>
    #     <td>U12</td>
    #     <td>2</td>
    #     <td>North Lakes</td>
    #     <td>12</td>
    #     <td>18</td>
    #     <td>Away</td>
    #   </tr>
    #   ...
    # </table>

A few notes:

csv.reader works better than splitting lines because it also handles quoted values, and even quoted values containing newlines
html.escape is used to escape strings that could potentially contain the characters < or >
it is often easier to work with dicts than with tuples
CSV files usually have a header (a first line with the column names) and can then be loaded easily with csv.DictReader; but Crushers.csv has no header (the data starts on the very first line), so we build the dicts ourselves in the read_csv function
both functions, read_csv and html_table, are generalized so they can work with any data; the column names are not "hardcoded"
yes, you could use pandas read_csv and to_html instead :) But it is good to know how to do it without pandas in case you need some customization. Or just as a programming exercise.
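The first two notes are easy to check on a single line. The sketch below uses a made-up row (the team name is invented so that it contains a comma, `&`, `<` and `>`) to compare naive splitting with csv.reader and to show what html.escape produces. It also shows that csv.DictReader accepts an explicit `fieldnames` argument, which is an alternative to building the dicts by hand for a headerless file.

```python
import csv
import io
from html import escape

# Invented row: the quoted team name contains a comma and HTML-special characters
line = 'U12,3,"Smith, Jones & Co <Reds>",10,6,Away\n'

# Naive splitting cuts the quoted field in two
naive = line.strip().split(",")
print(len(naive), naive)

# csv.reader keeps the quoted field intact and removes the quotes
row = next(csv.reader([line]))
print(len(row), row)

# escape() replaces &, < and > so the value is safe inside a <td>
print(escape(row[2]))

# DictReader with explicit field names for a file without a header row
fieldnames = ['age', 'week', 'opp', 'ACscr', 'OPPscr', 'location']
records = list(csv.DictReader(io.StringIO(line), fieldnames=fieldnames))
print(records[0]['opp'])
```

With `fieldnames` supplied, DictReader treats the first line as data rather than as a header, which matches the layout of Crushers.csv as described above.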