A data scientist based in the Netherlands, dedicated to integrating data science with data engineering.
On weekends I ride my bike along the canals, exploring and hunting for good food.
When I don't feel like moving, I settle into a café and write some code, which is my way of relaxing.
A data scientist's accumulated experience and skill tree differ depending on the industry, the demands of the environment, and even the growth stage of the company. That is even more true for someone like me, who moved from econometrics to data engineering before finally becoming a data scientist.
Data science is an interdisciplinary field that broadly covers all knowledge about collecting, processing, presenting, and analyzing data, with the goal of understanding data and extracting meaningful value and insight from it. To do that, a practitioner needs knowledge of mathematics, statistics, and computer science, and ideally basic programming skills to put it all into practice.
Data science covers a wide range of areas, including data visualisation, data engineering, machine learning, and data warehousing.
It is not easy to objectively and quantitatively measure a data scientist's level of expertise in each of these areas. On the right (or below) is an assessment of my own skill tree: a relative score I gave myself along the way as I learned. The result is this highly subjective scale, offered for your amusement.
July 2019 - Aug 2020
Pricing Center
- In charge of the Pricing Center.
- Mentored a team of five to build a pricing center that transforms order data into insights.
- The center covers 13+ million products and offers applications such as a search engine and product order prediction. It aims to support smarter pricing, promotion, and merchandising decisions with up-to-date competitive insights, helping the commercial side drive profitable growth.
May 2018 - July 2019
Lowest Price Guarantee
- Core developer on the Lowest Price Guarantee (LPG) project, responsible for algorithm training and roadmap design.
- Applied natural language processing, convolutional neural networks, autoencoders, and Elasticsearch to product matching and price comparison, keeping 75+% of online products effectively price-competitive.
July 2017 - May 2018
Warehouse Management System (Shopee24)
- As a full-stack developer, built the warehouse search engine, outbound module, return pipeline, and a real-time warehouse monitoring dashboard with a low-latency, high-efficiency design, powered by Flask and JavaScript.
- Used Apache Airflow as a workflow management tool to build ETL data pipelines, alongside cron jobs that track the execution of multiple workflows. Maintained multiple APIs and crawlers, building a solid bridge between the data warehouse and end-user applications.
July 2016 - Feb 2017
Recommendation system
- Assisted in developing a recommendation system based on item-based collaborative filtering (CF) to improve the social media user experience.
Crawler & API-Toolkit
- Operated several crawlers that scraped data from e-commerce platforms such as Taobao, PChome, and Yahoo for competitor analysis, and collected open driver data from the Driver Information Service for a vehicle monitoring system model.
- Developed a Python package of machine learning API utilities, including text preprocessing, regular expressions, and OCR based on Tesseract and Pillow.
Aug 2020 - Current
- Proposed a Long & Wide deep neural network (LWN) and implemented it in PyTorch.
- Helping RVO.nl research potential approaches to predicting crop types per parcel in advance.
- The LWN captures crop rotation patterns in the Long component. In the Wide component (Embedding, MLP), it jointly trains several nonlinear feature-embedding layers so that it generalizes to large-scale categorical inputs without excessive feature engineering.
- Experiments show a marked increase in prediction accuracy (+11%) over the original supervised models.
Sep 2012 - June 2016
- A passionate student who took up data science as a hobby and loved uncovering insights. Proficient in machine learning, positive economics, statistical econometrics, and business data analytics.
Dec 2015 - July 2016
- Organized a hackathon, a design-sprint-like event in which programmers, designers, subject-matter experts, and others collaborate intensively on idea development and software projects.
Some projects that you might find interesting.
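Data science is not just for data scientists: in the information age, every industry needs people who can put it to good use.
By sharing what I have learned and my industry experience, I hope to introduce more people to the ins and outs of data science.
My blog mainly covers data science, data engineering, machine learning, and programming tricks.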
The latest data science articles, delivered straight to you.
Join the mailing list to be notified as soon as a new article is published.
Try ChenYuTaiwan. Subscribe and get the latest news.
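Feedback and suggestions of any kind are welcome. Tell me which data science topics you would like to learn more about,
or point out where an article falls short, and I will improve it and consider writing a follow-up.
If you have an interesting data science project or anything else you would like to share,
please feel free to contact me via the form below.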
New Taipei City
886 Taiwan