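What is the best way to handle very large file uploads (1 GB and above) with Flask?
My application essentially takes multiple files, assigns them one unique file number, and then saves them on the server in a location chosen by the user.
How can we run the file upload as a background task, so that the user does not have the browser spinning for an hour and can instead proceed to the next page right away?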
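I think the super simple way to get around this is to send the file in lots of small parts (chunks). There are two parts to making this work: the front end (website) and the back end (server). For the front end, you can use something like Dropzone.js, which has no additional dependencies and includes decent CSS. All you have to do is add the class dropzone to a form, and it automatically turns it into one of its special drag-and-drop fields (you can also click and select).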
However, by default, dropzone does not chunk files. Luckily, it is really easy to enable. Here is a sample file upload form with DropzoneJS and chunking enabled:
<html lang="en">
<head>
    <meta charset="UTF-8">
    <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/dropzone/5.4.0/min/dropzone.min.css"/>
    <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/dropzone/5.4.0/min/basic.min.css"/>
    <script type="application/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/dropzone/5.4.0/min/dropzone.min.js"></script>
    <title>File Dropper</title>
</head>
<body>
<form method="POST" action='/upload' class="dropzone dz-clickable" id="dropper" enctype="multipart/form-data">
</form>
<script type="application/javascript">
    Dropzone.options.dropper = {
        paramName: 'file',
        chunking: true,
        forceChunking: true,
        url: '/upload',
        maxFilesize: 1025, // megabytes
        chunkSize: 1000000 // bytes
    }
</script>
</body>
</html>
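To make the chunking settings concrete, here is a small Python sketch of the arithmetic behind them. For a given file size and chunk size it computes the per-chunk values that the back end reads from the dzchunkindex, dzchunkbyteoffset, dztotalchunkcount and dztotalfilesize form fields. The helper name plan_chunks and the extra "size" key are made up for illustration and are not part of Dropzone or Flask. The sketch assumes fixed-size, contiguous chunks with a possibly shorter last chunk.

```python
def plan_chunks(total_size, chunk_size=1_000_000):
    """Return one dict per chunk, keyed by the chunk-metadata form-field names.

    plan_chunks is a hypothetical helper for illustration only.
    """
    if total_size < 0 or chunk_size <= 0:
        raise ValueError("total_size must be >= 0 and chunk_size must be > 0")
    # Ceiling division; an empty file is treated as one empty chunk in this sketch.
    total_chunks = max(1, (total_size + chunk_size - 1) // chunk_size)
    chunks = []
    for index in range(total_chunks):
        offset = index * chunk_size
        chunks.append({
            "dzchunkindex": index,
            "dzchunkbyteoffset": offset,
            "dztotalchunkcount": total_chunks,
            "dztotalfilesize": total_size,
            # Payload size of this chunk (illustrative key); the last one may be shorter.
            "size": min(chunk_size, total_size - offset),
        })
    return chunks


if __name__ == "__main__":
    for chunk in plan_chunks(2_500_000):
        print(chunk)
```

With a chunkSize of 1000000 bytes, a 2,500,000-byte file is split into three chunks at byte offsets 0, 1,000,000 and 2,000,000, the last one carrying 500,000 bytes. A 1 GB file ends up as roughly a thousand small requests instead of one very long one.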
Here is the back-end part using Flask:
import logging
import os

from flask import render_template, Blueprint, request, make_response
from werkzeug.utils import secure_filename

from pydrop.config import config

blueprint = Blueprint('templated', __name__, template_folder='templates')

log = logging.getLogger('pydrop')


@blueprint.route('/')
@blueprint.route('/index')
def index():
    # Route to serve the upload form
    return render_template('index.html',
                           page_name='Main',
                           project_name="pydrop")


@blueprint.route('/upload', methods=['POST'])
def upload():
    file = request.files['file']

    save_path = os.path.join(config.data_dir, secure_filename(file.filename))
    current_chunk = int(request.form['dzchunkindex'])

    # If the file already exists it's ok if we are adding a later chunk to it,
    # but not if it's a new file that would overwrite the existing one
    if os.path.exists(save_path) and current_chunk == 0:
        # 400 and 500s will tell dropzone that an error occurred and show an error
        return make_response(('File already exists', 400))

    try:
        # 'r+b' honors seek(); in append mode ('ab') writes typically go to
        # the end of the file regardless of the seek position
        mode = 'r+b' if os.path.exists(save_path) else 'wb'
        with open(save_path, mode) as f:
            f.seek(int(request.form['dzchunkbyteoffset']))
            f.write(file.stream.read())
    except OSError:
        # log.exception will include the traceback so we can see what's wrong
        log.exception('Could not write to file')
        return make_response(("Not sure why,"
                              " but we couldn't write the file to disk", 500))

    total_chunks = int(request.form['dztotalchunkcount'])

    if current_chunk + 1 == total_chunks:
        # This was the last chunk, the file should be complete and the size we expect
        if os.path.getsize(save_path) != int(request.form['dztotalfilesize']):
            log.error(f"File {file.filename} was completed, "
                      f"but has a size mismatch. "
                      f"Was {os.path.getsize(save_path)} but we"
                      f" expected {request.form['dztotalfilesize']} ")
            return make_response(('Size mismatch', 500))
        else:
            log.info(f'File {file.filename} has been uploaded successfully')
    else:
        log.debug(f'Chunk {current_chunk + 1} of {total_chunks} '
                  f'for file {file.filename} complete')

    return make_response(("Chunk upload successful", 200))
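The write step of the upload handler can be tried out without Flask. The sketch below uses only the standard library and assumes each chunk arrives together with its byte offset (as in dzchunkbyteoffset). write_chunk and is_complete are hypothetical helper names, not part of Flask or Dropzone. The file is opened in 'r+b' mode rather than append mode, because on most platforms a file opened with 'ab' writes at the end of the file regardless of any earlier seek().

```python
import os
import tempfile


def write_chunk(path, offset, data):
    """Write data at the given byte offset, creating the file if needed.

    Hypothetical helper for illustration only.
    """
    # 'r+b' keeps existing bytes and honors seek(); 'wb' creates a new file.
    mode = "r+b" if os.path.exists(path) else "wb"
    with open(path, mode) as f:
        f.seek(offset)
        f.write(data)


def is_complete(path, expected_size):
    """Return True when the file on disk has the expected total size."""
    return os.path.getsize(path) == expected_size


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "upload.bin")
        write_chunk(target, 0, b"hello ")
        write_chunk(target, 6, b"world")
        print(is_complete(target, 11))
        with open(target, "rb") as f:
            print(f.read())
```

The size check mirrors the comparison against dztotalfilesize in the handler. It catches truncated uploads but not corrupted content, so a checksum would be needed if integrity matters.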