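Author: Ramanan Balakrishnan
Original: https://engineering.semantics3.com/2016/11/13/machine-learning-practice-to-production/
Editor's note from 我爱机器学习 (52ml.net): What issues need attention when taking machine learning to production? Data acquisition, data pre-processing, choice of programming language and framework, model training, offline versus real-time prediction, embedded models versus interfaces, RPC versus other protocols, prediction monitoring, log handling, online learning, and so on.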
Do I have a reliable source of data? Where do I obtain my dataset?
What pre-processing steps are required? How do I normalize my data before using it with my algorithms?
Which language/framework do I use? Python, R, Java, C++? Caffe, Torch, Theano, TensorFlow, DL4J?
How do I train my models? Should I buy GPUs, custom hardware, or EC2 (spot?) instances? Can I parallelize training for speed?
Do I need to make batched or real-time predictions? Embedded models or interfaces? RPC or REST?
How do I keep track of my predictions? Do I log my results to a database? What about online learning?
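To make the pre-processing question above concrete, here is a minimal sketch of z-score normalization, assuming purely numeric features held in plain Python lists. The function names `fit_scaler` and `transform` are hypothetical and do not come from the original article. The point it illustrates is that the statistics are computed once on the training data and then reused at prediction time, so production inputs are scaled the same way as the data the model was trained on.

```python
import math


def fit_scaler(rows):
    """Compute per-column mean and standard deviation from training rows."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in col) / n)
        for col, m in zip(columns, means)
    ]
    # Constant columns have zero spread; use 1.0 to avoid dividing by zero.
    stds = [s if s > 0 else 1.0 for s in stds]
    return means, stds


def transform(rows, means, stds):
    """Scale rows using statistics computed earlier by fit_scaler."""
    return [
        [(x - m) / s for x, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Hypothetical toy data: two samples with two features each.
train = [[1.0, 10.0], [3.0, 10.0]]
means, stds = fit_scaler(train)
scaled = transform(train, means, stds)
```

In a production setting the `means` and `stds` values would typically be stored alongside the trained model, so that the prediction service applies exactly the same scaling.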