Tianchen Wu

294 Followers


Dec 15, 2022

Unit Test SQL using dbt

After years of working in data science and engineering, I have found data quality to be the ghost that haunts almost every project, undermining business outcomes. SQL is the de facto language of data. One way to improve data quality is to strengthen the SQL codebase with unit tests and data tests. This article…

Data Engineering

4 min read



Published in

FAUN Publication

·Oct 4, 2021

Common Patterns of Infrastructure as Code Architecture — Terraform and Terragrunt

Introduction This article discusses common patterns in Infrastructure as Code (IaC) to summarize and highlight the design principles underlying good architecture. Terraform is a well-known IaC tool from HashiCorp. Terragrunt is a thin wrapper around Terraform that provides extra tools for keeping your configurations DRY, working with multiple Terraform modules, and managing…

Terraform

6 min read



Published in

FAUN Publication

·Aug 8, 2021

Terraform Migrate

Introduction This article studies the basic principles of migrating Terraform tfstate. Such a migration can become necessary after a large refactoring of infrastructure code. The essence of migration is to keep alignment: between the Terraform script and its corresponding tfstate (i.e. in resource addressing, configuration, etc.); between the tfstate and reality in the cloud; between the source tfstate…

Terraform

4 min read



Published in

FAUN Publication

·Jun 21, 2021

Containerize Development Environment with Visual Studio Code

Introduction 工欲善其事, 必先利其器 There is an old Chinese saying: “To do a good job, an artisan needs the best tools”. My toolbox for boosting development productivity is Visual Studio Code + Docker. Using Visual Studio Code with remote containers makes it easy to provision a full-featured IDE with Docker containers and installation…

3 min read



Published in

FAUN Publication

·Oct 5, 2020

Terraform at Scale — Modularized Hierarchical Layout and Continuous Delivery of Infrastructure

Introduction This article describes a systematic way of applying Terraform at scale. “At scale” refers to: high technical complexity (deploying infrastructure to any number of accounts and cloud providers); and high organizational complexity (enabling multiple teams of developers to work collaboratively). The requirement calls for a modularized and hierarchical architecture as well…

Terraform

6 min read



Published in

Towards Data Science

·Oct 2, 2019

Enable ML Experiments

Foreword I was recently inspired by talks from Dmitry Petrov about machine learning model and dataset versioning practices and data versioning in machine learning projects. Most current ML systems lack efficient and systematic ways to deliver the value of data through data science into the…

Git

6 min read



Sep 14, 2019

Run Gitlab Pipeline DRY

GitLab CI/CD pipelines are configured using a YAML file called .gitlab-ci.yml within each project. YAML is a feature-rich data serialization language; by combining it with GitLab CI/CD features we can keep CI/CD pipelines DRY, an important criterion for deployment pipelines as code. Introduction GitLab CI/CD pipelines are composed of jobs. …

DevOps

4 min read



Published in

Towards Data Science

·Aug 11, 2019

Version Control ML Model

Machine learning operations (let’s call it MLOps, following the current xxOps buzzword pattern) are quite different from traditional software development operations (DevOps). One of the reasons is that ML experiments demand large datasets and model artifacts in addition to code (small plain-text files). This post presents a solution to version control machine…

Git

3 min read



Apr 4, 2019

Migrate MS Access MiniApp to Cloud Serverless Stack

After twenty years of Microsoft Access dominance, more and more companies are choosing to migrate their services to the cloud for lower cost and higher elasticity. This blog tells the story of migrating an MS Access MiniApp to an AWS serverless stack. This MiniApp has three parts:

AWS

3 min read



Published in

Towards Data Science

·Aug 5, 2018

Implement CRISP Data Science with AWS SageMaker

This article aims to demonstrate the capability and agility of AWS for developing and hosting both industry-standard machine learning products and research-level algorithms. The concept is validated with an implementation of the CRISP Data Science Workflow on the AWS cloud, using an “object detection algorithm” example (https://github.com/wutianchen/object-detection-algo). Separation of Concerns in Data Science CRISP (Cross-industry standard process for data…

AWS

4 min read



Data & Cloud Engineer
