How to convert xls to xlsx

Question

I have some *.xls (Excel 2003) files and I want to convert them to xlsx (Excel 2007).

I'm using the uno Python package. When I save the documents I can set the filter name to MS Excel 97, but there is no filter name like 'MS Excel 2007'.

Please help: how can I set the filter name to convert xls to xlsx?

Ray · Answer

I've had to do this before. The main idea is to use the xlrd module to open and parse the xls file, then write its contents to an xlsx file using the openpyxl module.

Here is my code. Caution! It cannot handle complex xls files; you will need to add your own parsing logic if you are going to use it.

    # Note: Python 2 code (xrange), written against an early openpyxl release.
    import xlrd
    from openpyxl.workbook import Workbook
    from openpyxl.reader.excel import load_workbook, InvalidFileException

    def open_xls_as_xlsx(filename):
        # first open using xlrd
        book = xlrd.open_workbook(filename)
        index = 0
        nrows, ncols = 0, 0
        # skip empty sheets until one with data is found
        while nrows * ncols == 0:
            sheet = book.sheet_by_index(index)
            nrows = sheet.nrows
            ncols = sheet.ncols
            index += 1

        # prepare a xlsx sheet
        book1 = Workbook()
        sheet1 = book1.get_active_sheet()

        for row in xrange(0, nrows):
            for col in xrange(0, ncols):
                # recent openpyxl releases expect 1-based row/column indices here
                sheet1.cell(row=row, column=col).value = sheet.cell_value(row, col)

        return book1

chfw · Answer

Here is my solution, which does not take fonts, charts, or images into account:

    $ pip install pyexcel pyexcel-xls pyexcel-xlsx

Then do this::

    import pyexcel as p

    p.save_book_as(file_name='your-file-in.xls',
                   dest_file_name='your-new-file-out.xlsx')

If you do not need a program, you can install an additional package, pyexcel-cli::

    $ pip install pyexcel-cli
    $ pyexcel transcode your-file-in.xls your-new-file-out.xlsx

The transcoding procedure above uses xlrd and openpyxl.

kvdogan · Answer

win32com must be installed on your machine. Here is my code:

    import win32com.client as win32

    fname = "full+path+to+xls_file"
    excel = win32.gencache.EnsureDispatch('Excel.Application')
    wb = excel.Workbooks.Open(fname)

    wb.SaveAs(fname + "x", FileFormat=51)  # FileFormat = 51 is for .xlsx extension
    wb.Close()                             # FileFormat = 56 is for .xls extension
    excel.Application.Quit()
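The two FileFormat codes quoted in the comments can be kept in a small lookup so the target extension and the code stay in sync. This is only a sketch: `FILE_FORMATS` and `save_as_args` are hypothetical helper names, not part of win32com, and the result would still have to be passed to `wb.SaveAs` on a Windows machine with Excel installed.

```python
import os

# FileFormat codes quoted in the answer above: 51 -> .xlsx, 56 -> .xls
FILE_FORMATS = {".xlsx": 51, ".xls": 56}


def save_as_args(src_path, target_ext=".xlsx"):
    """Return (destination_path, file_format_code) for a SaveAs call.

    Hypothetical helper: it only computes the arguments, it does not
    talk to Excel.
    """
    target_ext = target_ext.lower()
    code = FILE_FORMATS[target_ext]
    base, _old_ext = os.path.splitext(src_path)
    return base + target_ext, code


dest, file_format = save_as_args("report.xls")
# With wb opened as in the answer: wb.SaveAs(dest, FileFormat=file_format)
```

Replacing the extension instead of appending "x" also keeps the helper usable for the reverse direction (.xlsx to .xls).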

Jackypengyu · Answer

I did not find any answer here that worked 100%, so I'm posting my code here:

    import xlrd
    from openpyxl.workbook import Workbook

    def cvt_xls_to_xlsx(src_file_path, dst_file_path):
        book_xls = xlrd.open_workbook(src_file_path)
        book_xlsx = Workbook()

        sheet_names = book_xls.sheet_names()
        for sheet_index in range(0, len(sheet_names)):
            sheet_xls = book_xls.sheet_by_name(sheet_names[sheet_index])
            if sheet_index == 0:
                # reuse the sheet that a new Workbook creates by default
                sheet_xlsx = book_xlsx.active
                sheet_xlsx.title = sheet_names[sheet_index]
            else:
                sheet_xlsx = book_xlsx.create_sheet(title=sheet_names[sheet_index])

            for row in range(0, sheet_xls.nrows):
                for col in range(0, sheet_xls.ncols):
                    # xlrd indices are 0-based, openpyxl indices are 1-based
                    sheet_xlsx.cell(row=row + 1, column=col + 1).value = sheet_xls.cell_value(row, col)

        book_xlsx.save(dst_file_path)

Malexandre · Answer

Ray's answer helped me a lot, but for those looking for a simple way to convert all the sheets of an xls file into an xlsx, I made this Gist:

    import xlrd
    from openpyxl.workbook import Workbook as openpyxlWorkbook

    # content is a string containing the file. For example the result of an http.request(url).
    # You can also use a filepath by calling "xlrd.open_workbook(filepath)".

    xlsBook = xlrd.open_workbook(file_contents=content)
    workbook = openpyxlWorkbook()

    for i in range(0, xlsBook.nsheets):
        xlsSheet = xlsBook.sheet_by_index(i)
        sheet = workbook.active if i == 0 else workbook.create_sheet()
        sheet.title = xlsSheet.name

        for row in range(0, xlsSheet.nrows):
            for col in range(0, xlsSheet.ncols):
                # xlrd indices are 0-based, openpyxl indices are 1-based
                sheet.cell(row=row + 1, column=col + 1).value = xlsSheet.cell_value(row, col)

    # The new xlsx file is in "workbook", without iterators (iter_rows).
    # For iteration, use "for row in worksheet.rows:".
    # For range iteration, use "for row in worksheet.range("{}:{}".format(startCell, endCell)):".

You can find the xlrd lib here and openpyxl here (you need to upload xlrd into your project for Google App Engine, for example).

Jhon Anderson · Answer

I improved the performance of @Jackypengyu's method.

Merged cells are converted as well.

Résultats

Converting the same 12 files in the same order:

Original:

    0:00:01.958159
    0:00:02.115891
    0:00:02.018643
    0:00:02.057803
    0:00:01.267079
    0:00:01.308073
    0:00:01.245989
    0:00:01.289295
    0:00:01.273805
    0:00:01.276003
    0:00:01.293834
    0:00:01.261401

Improved:

    0:00:00.774101
    0:00:00.734749
    0:00:00.741434
    0:00:00.744491
    0:00:00.320796
    0:00:00.279045
    0:00:00.315829
    0:00:00.280769
    0:00:00.316380
    0:00:00.289196
    0:00:00.347819
    0:00:00.284242
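To compare the two runs as a whole, the timings can be totaled and divided. The sketch below uses only the figures quoted above; the helper name `parse` and the variable names are illustrative.

```python
from datetime import timedelta


def parse(stamp):
    """Turn an 'H:MM:SS.ffffff' string into a timedelta."""
    hours, minutes, seconds = stamp.split(":")
    return timedelta(hours=int(hours), minutes=int(minutes), seconds=float(seconds))


original = """0:00:01.958159 0:00:02.115891 0:00:02.018643 0:00:02.057803
0:00:01.267079 0:00:01.308073 0:00:01.245989 0:00:01.289295
0:00:01.273805 0:00:01.276003 0:00:01.293834 0:00:01.261401""".split()

improved = """0:00:00.774101 0:00:00.734749 0:00:00.741434 0:00:00.744491
0:00:00.320796 0:00:00.279045 0:00:00.315829 0:00:00.280769
0:00:00.316380 0:00:00.289196 0:00:00.347819 0:00:00.284242""".split()

original_total = sum(map(parse, original), timedelta())
improved_total = sum(map(parse, improved), timedelta())

# Dividing two timedeltas yields a plain float ratio.
speedup = original_total / improved_total
print(f"original: {original_total}, improved: {improved_total}, speedup: {speedup:.2f}x")
```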

Solution

    import datetime

    import xlrd
    from openpyxl.workbook import Workbook
    from openpyxl.utils.cell import get_column_letter

    def cvt_xls_to_xlsx(*args, **kw):
        """Open and convert XLS file to openpyxl.workbook.Workbook object

        @param args: args for xlrd.open_workbook
        @param kw: kwargs for xlrd.open_workbook
        @return: openpyxl.workbook.Workbook
        """
        book_xls = xlrd.open_workbook(*args, formatting_info=True, ragged_rows=True, **kw)
        book_xlsx = Workbook()

        sheet_names = book_xls.sheet_names()
        for sheet_index in range(len(sheet_names)):
            sheet_xls = book_xls.sheet_by_name(sheet_names[sheet_index])

            if sheet_index == 0:
                sheet_xlsx = book_xlsx.active
                sheet_xlsx.title = sheet_names[sheet_index]
            else:
                sheet_xlsx = book_xlsx.create_sheet(title=sheet_names[sheet_index])

            # copy merged ranges (xlrd: 0-based, exclusive upper bounds)
            for crange in sheet_xls.merged_cells:
                rlo, rhi, clo, chi = crange
                sheet_xlsx.merge_cells(
                    start_row=rlo + 1, end_row=rhi,
                    start_column=clo + 1, end_column=chi,
                )

            def _get_xlrd_cell_value(cell):
                value = cell.value
                if cell.ctype == xlrd.XL_CELL_DATE:
                    value = datetime.datetime(*xlrd.xldate_as_tuple(value, 0))
                return value

            # append whole rows instead of writing cell by cell
            for row in range(sheet_xls.nrows):
                sheet_xlsx.append((
                    _get_xlrd_cell_value(cell)
                    for cell in sheet_xls.row_slice(row, end_colx=sheet_xls.row_len(row))
                ))

            # carry over hidden rows and columns
            for rowx in range(sheet_xls.nrows):
                rowinfo = sheet_xls.rowinfo_map.get(rowx)
                if rowinfo is not None and rowinfo.hidden != 0:
                    print(sheet_names[sheet_index], rowx)
                    sheet_xlsx.row_dimensions[rowx + 1].hidden = True
            for coly in range(sheet_xls.ncols):
                colinfo = sheet_xls.colinfo_map.get(coly)
                if colinfo is not None and colinfo.hidden != 0:
                    print(sheet_names[sheet_index], coly)
                    coly_letter = get_column_letter(coly + 1)
                    sheet_xlsx.column_dimensions[coly_letter].hidden = True

        return book_xlsx
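The merged-cell handling in the solution above comes down to an index conversion: xlrd reports each merged range as a 0-based `(rlo, rhi, clo, chi)` tuple whose upper bounds are exclusive, while openpyxl's `merge_cells` takes 1-based, inclusive bounds. A minimal sketch of just that mapping, with `xlrd_range_to_openpyxl` as a hypothetical helper name:

```python
def xlrd_range_to_openpyxl(crange):
    """Map an xlrd merged range to keyword arguments for merge_cells.

    Lower bounds shift by +1 (0-based -> 1-based); upper bounds stay as
    they are because exclusive 0-based equals inclusive 1-based.
    """
    rlo, rhi, clo, chi = crange
    return {
        "start_row": rlo + 1,
        "end_row": rhi,
        "start_column": clo + 1,
        "end_column": chi,
    }


# A merge covering A1:C2 (rows 0-1, columns 0-2 in xlrd terms).
bounds = xlrd_range_to_openpyxl((0, 2, 0, 3))
# With an openpyxl worksheet at hand: sheet_xlsx.merge_cells(**bounds)
```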

Atais · Answer

Simple solution

I needed a simple solution to convert a couple of files from xls to xlsx. There are plenty of answers here, but they do some "magic" that I don't fully understand.

A simple solution was given by chfw, but it was not quite complete.

Install dependencies

Use pip to install

pip install pyexcel-cli pyexcel-xls pyexcel-xlsx

Execute

All styling and macros will be gone, but the data stays intact.

For a single file

pyexcel transcode your-file-in.xls your-new-file-out.xlsx

For all files in the folder, a one-liner

for file in *.xls; do echo "Transcoding $file"; pyexcel transcode "$file" "${file}x"; done
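The same batch conversion could also be written in Python. This is a sketch under assumptions: `plan_batch` and `transcode_folder` are hypothetical names, and the actual conversion relies on `pyexcel.save_book_as` as used in chfw's answer, so pyexcel, pyexcel-xls and pyexcel-xlsx must be installed for the second function to run.

```python
from pathlib import Path


def plan_batch(folder):
    """Pair every .xls file in `folder` with its .xlsx target path."""
    sources = sorted(Path(folder).glob("*.xls"))
    return [(src, src.with_suffix(".xlsx")) for src in sources]


def transcode_folder(folder):
    """Convert every .xls file in `folder`, writing the .xlsx next to it."""
    import pyexcel  # imported lazily so plan_batch works without it

    for src, dst in plan_batch(folder):
        print(f"Transcoding {src}")
        pyexcel.save_book_as(file_name=str(src), dest_file_name=str(dst))
```

Calling `transcode_folder(".")` would then mirror the shell loop for the current directory.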

lordwilliamsr · Answer

CONVERT AN XLS FILE TO XLSX

Using python3.6, I just ran into the same problem and, after hours of struggle, I solved it by doing the following. You probably won't need all of the packages:

Make sure to install the following packages before proceeding

    pip install pyexcel
    pip install pyexcel-xls
    pip install pyexcel-xlsx
    pip install pyexcel-cli

step 1:

    import pyexcel

step 2: "example.xls", "example.xlsx", "example.xlsm"

    sheet0 = pyexcel.get_sheet(file_name="your_file_path.xls", name_columns_by_row=0)

step 3: create an array from the contents

    xlsarray = sheet0.to_array()

step 4: inspect the variable's contents to verify

    xlsarray

step 5: pass the array held in the variable (xlsarray) to a new workbook variable called (sheet1)

    sheet1 = pyexcel.Sheet(xlsarray)

step 6: save the new sheet with a name ending in .xlsx (in my case, I want xlsx)

    sheet1.save_as("test.xlsx")

benmichae2. · Answer

Ray's answer was cutting off the first row and the last column of the data. Here is my modified solution (for python3):

    import xlrd
    from openpyxl.workbook import Workbook

    def open_xls_as_xlsx(filename):
        # first open using xlrd
        book = xlrd.open_workbook(filename)
        index = 0
        nrows, ncols = 0, 0
        while nrows * ncols == 0:
            sheet = book.sheet_by_index(index)
            nrows = sheet.nrows + 1   # bm added +1
            ncols = sheet.ncols + 1   # bm added +1
            index += 1

        # prepare a xlsx sheet
        book1 = Workbook()
        sheet1 = book1.active

        for row in range(1, nrows):
            for col in range(1, ncols):
                sheet1.cell(row=row, column=col).value = sheet.cell_value(row - 1, col - 1)  # bm added -1's

        return book1
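The +1 and -1 adjustments above reflect the two libraries' conventions: xlrd addresses cells from 0, while current openpyxl addresses them from 1. A library-free sketch of the shift, where `one_based_cells` is a hypothetical helper and the nested list stands in for values read with `sheet.cell_value(row, col)`:

```python
def one_based_cells(rows):
    """Yield (row, column, value) with 1-based indices from a 0-based grid."""
    for r, row in enumerate(rows, start=1):
        for c, value in enumerate(row, start=1):
            yield r, c, value


grid = [["id", "name"], [1, "Ada"]]  # stand-in for data read through xlrd
cells = list(one_based_cells(grid))
# Each tuple maps onto: sheet1.cell(row=row, column=col).value = value
```

Keeping the offset in one place avoids spreading +1 and -1 corrections across loop bounds and lookups.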

CakeL · Answer

I tried @Jhon Anderson's solution. It works well, but it raised a "year is out of range" error when there were time-formatted cells such as HH:mm:ss without a date. So I improved the algorithm further:

    import datetime

    import openpyxl
    import xlrd

    def xls_to_xlsx(*args, **kw):
        """
        Open and convert an XLS file to openpyxl.workbook.Workbook
        ----------
        @param args: args for xlrd.open_workbook
        @param kw: kwargs for xlrd.open_workbook
        @return: openpyxl.workbook.Workbook object
        """
        book_xls = xlrd.open_workbook(*args, formatting_info=True, ragged_rows=True, **kw)
        book_xlsx = openpyxl.workbook.Workbook()

        sheet_names = book_xls.sheet_names()
        for sheet_index in range(len(sheet_names)):
            sheet_xls = book_xls.sheet_by_name(sheet_names[sheet_index])
            if sheet_index == 0:
                sheet_xlsx = book_xlsx.active
                sheet_xlsx.title = sheet_names[sheet_index]
            else:
                sheet_xlsx = book_xlsx.create_sheet(title=sheet_names[sheet_index])

            for crange in sheet_xls.merged_cells:
                rlo, rhi, clo, chi = crange
                sheet_xlsx.merge_cells(start_row=rlo + 1, end_row=rhi,
                                       start_column=clo + 1, end_column=chi)

            def _get_xlrd_cell_value(cell):
                value = cell.value
                if cell.ctype == xlrd.XL_CELL_DATE:
                    datetime_tup = xlrd.xldate_as_tuple(value, 0)
                    if datetime_tup[0:3] == (0, 0, 0):   # time format without date
                        value = datetime.time(*datetime_tup[3:])
                    else:
                        value = datetime.datetime(*datetime_tup)
                return value

            for row in range(sheet_xls.nrows):
                sheet_xlsx.append((
                    _get_xlrd_cell_value(cell)
                    for cell in sheet_xls.row_slice(row, end_colx=sheet_xls.row_len(row))
                ))
        return book_xlsx

Then it works perfectly!
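The time-only branch can be tried in isolation, without an xls file. The sketch below assumes tuples shaped like the ones `xlrd.xldate_as_tuple` returns, `(year, month, day, hour, minute, second)`, with a zero date part for time-only cells; `tuple_to_value` is a hypothetical name for the extracted logic.

```python
import datetime


def tuple_to_value(datetime_tup):
    """Build a time for date-less tuples, a datetime otherwise."""
    if datetime_tup[0:3] == (0, 0, 0):
        # A zero date cannot become a datetime (year 0 is out of range),
        # so keep only the time-of-day part.
        return datetime.time(*datetime_tup[3:])
    return datetime.datetime(*datetime_tup)


time_only = tuple_to_value((0, 0, 0, 13, 45, 30))
full = tuple_to_value((2020, 1, 31, 13, 45, 30))

# The unguarded conversion is what triggers the reported error:
try:
    datetime.datetime(0, 0, 0, 13, 45, 30)
except ValueError as exc:
    error_message = str(exc)
```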