Dynamic Function Calls in Python
Requirement
HTTP -> Aliyun FC -> Python Handler -> dynamic condition -> Template
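After an invoice file has been recognized by OCR, the required information is extracted from the resulting JSON.
Each supplier has its own invoice, and there are many suppliers, so invoices are identified by supplier name. My approach is one .py file per supplier, named after the supplier code.
While writing this, I wanted to avoid the following code structure: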
import supplier1 as s1
import supplier2 as s2

if supplier_code == 'supplier1':
    s1.parse()
elif supplier_code == 'supplier2':
    s2.parse()
# ....
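This approach is clearly neither flexible nor dynamic. After some research I found that Python can import modules dynamically, which led to the following code: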
import importlib
import traceback

import common as util


def dispatch(plant: str, supplier: str, file_extension: str, ocr_result_obj):
    try:
        # Import template/<supplier>.py by name at runtime.
        dynamic_module = importlib.import_module('template.' + supplier)
        # Look up the module's parse function by name.
        func_call = getattr(dynamic_module, "parse")
        return func_call(plant, supplier, file_extension, ocr_result_obj, util)
    # Specific handlers come first; placed after "except Exception"
    # they would never be reached.
    except ModuleNotFoundError as e1:
        raise ModuleNotFoundError("Module not found: {0}".format(e1))
    except AttributeError as ae:
        raise ModuleNotFoundError("Call failed: {0}".format(ae))
    except TypeError as te:
        raise ModuleNotFoundError("Type error: {0}".format(te))
    except Exception:
        # Any other error is only logged; dispatch then returns None.
        traceback.print_exc()
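import_module imports the module dynamically, and getattr looks up the module's function by name. This way, adding a parsing template for a new supplier only requires creating a .py file with the matching name in the template directory and defining a parse function in it; nothing else needs to change.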
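The mechanism can be tried in isolation with a throwaway package. The following is a minimal sketch, not the project's code: the package name demo_templates, the supplier acme, and the one-argument parse signature are all made up for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_demo_package(root: Path) -> None:
    """Create a throwaway package demo_templates/ with one supplier module."""
    pkg = root / "demo_templates"          # hypothetical package name
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "acme.py").write_text(          # hypothetical supplier code "acme"
        "def parse(text):\n"
        "    return 'acme parsed: ' + text\n"
    )


def dispatch(supplier: str, text: str) -> str:
    """Import demo_templates.<supplier> by name and call its parse function."""
    module = importlib.import_module("demo_templates." + supplier)
    parse = getattr(module, "parse")
    return parse(text)


with tempfile.TemporaryDirectory() as tmp:
    make_demo_package(Path(tmp))
    sys.path.insert(0, tmp)                # make the demo package importable
    try:
        result = dispatch("acme", "invoice 42")
        try:
            dispatch("unknown_supplier", "invoice 42")
            missing_raises = False
        except ModuleNotFoundError:
            # A supplier without a matching file fails at import time.
            missing_raises = True
    finally:
        sys.path.remove(tmp)

print(result)
print(missing_raises)
```

Supporting another supplier in this sketch means writing one more file next to acme.py; dispatch itself stays untouched.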
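# Example supplier template: template/<supplier code>.py
import re


def get_ln(text: str):
    # Extract the first integer (optionally negative) from the text.
    pattern = re.compile(r'-?[1-9]\d*')
    result = pattern.search(text)
    # Return None instead of raising AttributeError when nothing matches.
    return result.group() if result is not None else None


def get_po(text: str):
    # True if the text starts with a PO number: "PXZ" plus five digits, spaces allowed.
    pattern = re.compile(r'P\s*X\s*Z\s*[0-9]{5}')
    return pattern.match(text) is not None


def get_invoice_amount(text: str):
    # True if the text ends with "00 USD".
    pattern = re.compile(r'00\s*USD$')
    return pattern.search(text) is not None


def get_ip_no(text: str):
    # True if the text starts with "de" and ends with a digit (case-insensitive).
    pattern = re.compile(r'^\bde\s*.*\s*[0-9]$', re.IGNORECASE)
    return pattern.search(text) is not None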
def parse(plant, supplier, file_extension, ocr_result_obj, util):
    # print("Received arguments: {0} {1}".format(plant, str(ocr_result_obj['Headers'])))
    # PDFs and other file types keep their entries under different keys.
    if file_extension == '.pdf':
        PROPERTY = "Text"
        results = ocr_result_obj['Body']['Data']['Results']
    else:
        PROPERTY = "Word"
        results = ocr_result_obj['Body']['Data']['WordsInfo']
    po = util.find_match_by(results, get_po, PROPERTY)
    invoice_amount = util.find_match_by(results, get_invoice_amount, PROPERTY)
    lp_no = util.find_match_by(results, get_ip_no, PROPERTY)
    ln = util.find_result_by_text('Invoice number', results, get_ln, PROPERTY)
    if po is not None:
        po = po.replace(' ', '')
    if lp_no is not None:
        lp_no = util.get_match_number(lp_no)
    if invoice_amount is not None:
        # Strip spaces, thousands separators and the currency suffix.
        invoice_amount = invoice_amount.replace(' ', '').replace(',', '').replace('USD', '')
    # return [ln,po,invoice_amount,lp_no]
    return util.wrapper(ln, invoice_amount, lp_no, po, None, None)
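The common module is not shown here, so the exact behavior of find_match_by is unknown. As a rough sketch, assuming it returns the text of the first OCR entry accepted by the predicate, and assuming each entry is a dict keyed by "Text" or "Word", it might look like this. The sample entries are invented.

```python
import re


def find_match_by(results, predicate, prop):
    """Hypothetical stand-in: text of the first entry accepted by predicate, else None."""
    for entry in results:
        text = entry.get(prop, "")
        if predicate(text):
            return text
    return None


def get_po(text: str) -> bool:
    # Same pattern as the template above: "PXZ" plus five digits, spaces allowed.
    return re.match(r'P\s*X\s*Z\s*[0-9]{5}', text) is not None


# Made-up OCR entries shaped like the PDF branch, with the text under "Text".
sample_results = [
    {"Text": "Invoice number 1001"},
    {"Text": "P X Z 12345"},
    {"Text": "1,200.00 USD"},
]

po = find_match_by(sample_results, get_po, "Text")
print(po)                   # P X Z 12345
print(po.replace(' ', ''))  # PXZ12345
```

Passing the predicate as an argument keeps the supplier-specific regular expressions in the template file, while the search loop lives in one shared place.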
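Drawback:
In my setup, a dynamically called function could not import other libraries in its own file; doing so raised an error. I worked around this with callbacks (the util argument passed into parse), but that adds some complexity to the code.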