RYO SHIN (りょう しん) / LIANG ZHEN

ML Engineer & Researcher

Research. Build. Ship. All through language.

Selected metric

Selected Results

Measured accuracy improvement from a medical-AI project

Medical-term recognition (ASR recall)

Measures how reliably the speech-to-text AI captures the medical terms that belong in a clinical record. Improved by domain-adapting the ASR model on ~190 hours of medical audio.
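As an illustration only (this is not the project's evaluation code), term-level recall can be sketched as the share of expected medical terms that appear in the ASR output. The function name `term_recall`, the plain substring-matching rule and the sample terms below are assumptions made for this example.

```python
def term_recall(reference_terms, transcript):
    """Return the fraction of reference terms found in the transcript.

    Matching is a plain substring test, which is an assumption of this
    sketch; a real evaluation would likely normalise spelling variants.
    """
    if not reference_terms:
        raise ValueError("reference_terms must not be empty")
    hits = sum(1 for term in reference_terms if term in transcript)
    return hits / len(reference_terms)


# Hypothetical data: four terms expected in the chart, one missed by the ASR.
reference = ["hypertension", "metformin", "tachycardia", "edema"]
asr_output = "patient with hypertension on metformin, mild edema noted"

recall = term_recall(reference, asr_output)
print(f"{recall:.2%}")  # 3 of 4 terms found -> 75.00%
```

Recall is the natural metric here because a term missing from the transcript cannot reach the generated chart, whereas an extra term can still be filtered out downstream.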

About

About Me

A language-education researcher who builds AI.

I earned my PhD at Kyoto University researching educational applications of large language models (LLMs). From NINJAL's Vocabulary Profiler to medical ASR fine-tuning and a bid-extraction PoC, I move between research and implementation single-handedly to ship working products. Since April 2026 I have been an engineer at GMO Pepabo.

Kyoto University (M.A. and PhD completed)

github.com/ryoshin0830

Strengths

System design & development · Project management · Research & development (R&D)

Specialisation

Web Development

Full-stack delivery centred on Next.js / React / TypeScript / FastAPI / Node.js, owning the loop from requirements to deployment and operations.

ML & LLM Engineering

LoRA fine-tuning, agent design with LangChain / LangGraph, ASR improvement, multi-GPU training via DeepSpeed ZeRO, and validation of LLM and multimodal models including stepaudio, phi4 and Gemini.

Language Education Systems

Learner-corpus analysis, vocabulary difficulty estimation, automated test generation, extensive-reading platforms — built in close collaboration with classrooms and research institutes.

Specialization

  • Foreign Language Education, Second Language Acquisition, Applied Linguistics
  • Large Language Model (LLM) Fine-tuning and Development
  • Machine Learning for Vocabulary Difficulty Prediction and Language Assessment
  • Educational Grammar and English for Academic Purposes

Life Timeline

Professional Experience

Engagements as a contractor and full-time engineer

Apr 2026 — Present

Full-time

Engineer track · Current

GMO Pepabo, Inc. · IT / Internet

Role
Engineer (Full-time)
Team
362 employees company-wide

Feb 2026 — Mar 2026

Contract

Automated bid extraction & company-scoring PoC for telecom-construction tenders

Sapeet Inc. · AI / SaaS

Stood up a bid-extraction PoC and a company-scoring foundation

Role
Forward Deployed Engineer (Contract)
Team
Org 53 / Team 5

Phases owned

Requirements

High-level design

Detailed design

Implementation

Test & review

Maintenance & ops

Stack

Python · LLM

Responsibilities

  • Owning the bid PoC end-to-end, from requirements through extraction-logic design and refinement
  • Designing system topology, deployment operations and access control

Work items

  • Built an automated extraction pipeline for telecom-construction tender notices
  • Designed and implemented company-scoring logic
  • Improved extraction accuracy through iterative prompt design

Jun 2025 — Mar 2026

Contract

SUIREN — speed-reading practice platform for Japanese learners

Massey University (New Zealand) · Education / Research

Built the speed-reading platform 'SUIREN' for Japanese learners

Role
Full-stack (Contract)
Team
Org 3 / Team 3

Phases owned

Requirements

High-level design

Detailed design

Implementation

Test & review

Maintenance & ops

Stack

Next.js · TypeScript · Tailwind CSS · Vercel · PostgreSQL

Responsibilities

  • Designed and built a speed-reading practice platform in collaboration with Dr Mitsue Tabata-Sandom's Japanese extensive-reading research at Massey University
  • Designed and implemented reading-speed (WPM) and comprehension measurement

Work items

  • Level-graded speed-reading content and question flow
  • Real-time reading-speed (WPM) and comprehension measurement
  • Quiz-style interactive learning
  • Progress visualisation and tracking dashboard
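The two measurements named above can be sketched as follows. This is a minimal illustration in Python, not SUIREN's implementation (which is built on Next.js / TypeScript). The function names, the assumption that a word count is already available for each passage, and the sample figures are all hypothetical. How words are counted for Japanese text is not specified here.

```python
def words_per_minute(word_count, elapsed_seconds):
    """Reading speed in words per minute for one passage."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return word_count * 60 / elapsed_seconds


def comprehension_rate(answers, answer_key):
    """Share of comprehension questions answered correctly."""
    if not answer_key:
        raise ValueError("answer_key must not be empty")
    correct = sum(1 for given, expected in zip(answers, answer_key)
                  if given == expected)
    return correct / len(answer_key)


# Hypothetical session: a 400-word passage read in 2 min 30 s,
# followed by four multiple-choice questions.
speed = words_per_minute(400, 150)
score = comprehension_rate(["a", "c", "b", "d"], ["a", "c", "d", "d"])
print(speed, score)  # 160.0 0.75
```

Reporting speed and comprehension together matters because a high WPM with low comprehension suggests skimming rather than fluent reading.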

Jun 2025 — Jan 2026

Contract

medimo — automated medical-chart generation SaaS

medimo Inc. · Healthcare AI

Improved medical-term ASR recall from 82.26% to 89.72%

Role
Engineer / requirements & design lead (Contract)
Team
Org 40 / Team 25

Phases owned

Requirements

High-level design

Detailed design

Implementation

Test & review

Maintenance & ops

Stack

Python · FastAPI · TypeScript · React · LangGraph · LangChain · Dify · PyTorch · Transformers · DeepSpeed · AWS Aurora · AWS · Docker · Figma · Jupyter

Outcomes

  • Replaced bespoke per-doctor prompt writing with an automated flow that scales across diverse chart formats

Responsibilities

  • Owned requirements through design; also implemented and reviewed code
  • PoC, design and implementation of an auto-prompt-generation pipeline (LangGraph) covering per-doctor chart formats (SOAP / chronological etc.)
  • Drove ASR accuracy improvement: LoRA fine-tuning on ~190h of medical audio, evaluation design, training operations

Work items

  • Defined the spec under which uploaded historical charts auto-derive a per-doctor summary template
  • Designed and implemented the LangGraph prompt-generation flow, automatically applying templates per doctor / facility
  • Built execution and management surfaces in LangGraph so prompt generation and validation could be operated continuously
  • Built the prompt-generation UI and backend with FastAPI + React (collaborated with three engineers)
  • Validated and compared LLMs and multimodal models including stepaudio, phi4 and Gemini
  • Set up multi-GPU training and experiment workflows with DeepSpeed (ZeRO) to operate a continuous-improvement loop

Nov 2023 — Mar 2025

Contract

Vocabulary Profiler system

NINJAL — National Institute for Japanese Language and Linguistics · Research

Built a vocabulary profiler grounded in Prof. Tatsuhiko Matsushita's vocabulary research (the Vocabulary Database for Reading Japanese, VDRJ): it analyses input Japanese text in real time and visualises it by vocabulary difficulty and usage frequency

Role
Project Lead (Contract)
Team
Org 13 / Team 6

Phases owned

Requirements

High-level design

Detailed design

Implementation

Test & review

Maintenance & ops

Stack

JavaScript · TypeScript · React · Node.js · Express · PostgreSQL · Vercel · AWS · Docker

Outcomes

  • Integrated a Word2Vec-based vocabulary-difficulty model as an API, delivering profile output usable in research and classrooms

Responsibilities

  • Led the project end-to-end as PM, from requirements through implementation
  • Built the dashboard with React + Node.js
  • Implemented real-time data flow over WebSocket

Work items

  • Whole-team management and tech selection
  • Designed and implemented frontend (React) and backend (Node.js / Express)
  • Built the data-processing pipeline for morphological analysis and vocab-difficulty scoring
  • Database design and performance tuning on PostgreSQL
  • Hardened security with SSL/TLS and token-based auth
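The profiling step can be pictured roughly as below: each token is mapped to a difficulty band by its frequency rank, and the profile is the share of tokens per band. This is a simplified sketch, not the Vocabulary Profiler's code. The ranks, band thresholds and band labels are invented for illustration and do not reproduce VDRJ data, and the input is assumed to be already tokenised (the real pipeline relies on morphological analysis).

```python
from collections import Counter

# Hypothetical frequency ranks (1 = most frequent); not taken from VDRJ.
FREQUENCY_RANK = {"私": 10, "は": 1, "学生": 850, "です": 3, "形態素": 18000}

# Hypothetical band thresholds: (upper rank bound, label).
BANDS = [(1000, "basic"), (5000, "intermediate"), (20000, "advanced")]


def band_for(token):
    """Map a token to a difficulty band via its frequency rank."""
    rank = FREQUENCY_RANK.get(token)
    if rank is None:
        return "unlisted"
    for upper_bound, label in BANDS:
        if rank <= upper_bound:
            return label
    return "unlisted"


def profile(tokens):
    """Return the proportion of tokens that fall into each band."""
    if not tokens:
        return {}
    counts = Counter(band_for(token) for token in tokens)
    return {band: n / len(tokens) for band, n in counts.items()}


tokens = ["私", "は", "形態素", "です", "ぴえん"]  # pre-tokenised input
result = profile(tokens)
print(result)  # {'basic': 0.6, 'advanced': 0.2, 'unlisted': 0.2}
```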

Personal Products

Personal projects, OSS and education-domain tools — separate from contract work

Vocabulary Question Auto-Generation System

Automatic generation of Japanese vocabulary questions using Word2Vec and LDA — high-quality distractor generation via distributed representations and topic modelling

Key Features

  • Semantic-similarity distractor extraction via Word2Vec
  • Context-sensitive distractor generation via LDA topic modelling
  • Vocabulary-difficulty estimation via MeCab morphological analysis
  • Automatic ML-based scoring of distractor quality
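The semantic-similarity step can be sketched as a nearest-neighbour search around the correct answer. To stay self-contained, the example below uses hand-made three-dimensional vectors and a pure-Python cosine similarity in place of a trained Word2Vec model. The vectors, the function names and the choice of `k` are assumptions for illustration, and the LDA, difficulty and quality-scoring stages are not shown.

```python
import math

# Toy vectors standing in for trained Word2Vec embeddings (hypothetical values).
VECTORS = {
    "犬": (0.9, 0.1, 0.0),
    "猫": (0.8, 0.2, 0.1),
    "鳥": (0.6, 0.4, 0.1),
    "車": (0.0, 0.1, 0.9),
    "走る": (0.1, 0.9, 0.2),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pick_distractors(answer, k=2):
    """Return the k words most similar to the answer, excluding the answer."""
    target = VECTORS[answer]
    scored = [(cosine(target, vec), word)
              for word, vec in VECTORS.items() if word != answer]
    scored.sort(reverse=True)
    return [word for _, word in scored[:k]]


print(pick_distractors("犬"))  # ['猫', '鳥']
```

Choosing the nearest neighbours yields distractors from the same semantic field as the answer. A practical generator would also need to reject candidates that are so close in meaning that they count as correct answers.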

Tech Stack

Python · Word2Vec · scikit-learn · Gensim · NLTK · MeCab

Research

Innovation in AI Technology for Language Education

  • 4 Peer-Reviewed Papers
  • 5 Conference Presentations
  • 1 Book

Books

  1. 2026

    English Language Education: Current Issues and Future Directions I

    Liang, Z. (Chapter Author)

    Springer Nature (Singapore) · Chapter Author

Peer-Reviewed Papers

  1. 2025

    A Study on Role Language in Manga Used in Daily Conversations: From the Perspective of Japanese Language Education

    Wang, F., Kanamaru, T., & Liang, Z.

    Kotoba, 46, pp. 55–72

  2. 2023

    Development and Validation of an Audio-based Japanese Vocabulary Size Test for Japanese-Chinese Bilinguals

    Peng, Y., Liang, Z., & Sasao, Y.

    Japanese Language Education, 185, pp. 93–108

  3. 2023

    Motion and memory in VR: The influence of VR control method on memorization of foreign language orthography

    Vincent, N. H., Liang, Z., & Sasao, Y.

    International Journal on Cybernetics & Informatics (IJCI), 12(1), pp. 151–164

  4. 2022

    Subtitle Use in Japanese Learning through Visual Media: From the Perspective of Language Selection

    Peng, Y., Liang, Z., & Sasao, Y.

    Journal of Language and Cultural Education Research, 20, pp. 335–356

Conference Presentations

  1. 2025

    Collocation Analysis of Authorized English Textbooks under the New Course of Study: From the Perspective of Continuity from Elementary to High School

    Nakano, T., & Liang, Z.

    JASELE 50th Saitama Research Conference

  2. 2024

    Analysis of High-Frequency Collocations Based on a Corpus of Authorized English Textbooks

    Nakano, T., Liang, Z., & Sasao, Y.

    JASELE 49th Fukuoka Research Conference

  3. 2024

    Can General-Purpose LLMs Predict Vocabulary Difficulty Based on Japanese Learner Data?

    Liang, Z., & Sasao, Y.

    Association for Natural Language Processing 2024

  4. 2022

    Development and Validation of an Audio-based Japanese Vocabulary Size Test for Japanese-Chinese Bilinguals

    Peng, Y., Liang, Z., & Sasao, Y.

    Society for Teaching Japanese as a Foreign Language Autumn Conference Proceedings

  5. 2022

    Development and Validation of an Automatic Distractor Generation Program for Japanese Vocabulary Questions

    Liang, Z., & Sasao, Y.

    Society for Teaching Japanese as a Foreign Language Spring Conference Proceedings

Stack

Skills & Stack

Technologies I've used across work, research, and personal projects, grouped by proficiency

Programming Languages

Core
JavaScript · TypeScript · Python
Proficient
Swift · PHP

Frontend

Core
React · Next.js
Proficient
Tailwind CSS
Familiar
React Native

Backend

Core
FastAPI
Proficient
Node.js / Express · WordPress

AI & Machine Learning

Core
LangGraph
Proficient
LangChain · PyTorch · Transformers · DeepSpeed · LoRA · Dify · Scikit-learn · Word2Vec / gensim · MeCab · Jupyter

Databases

Core
PostgreSQL
Proficient
MariaDB / MySQL · AWS Aurora
Familiar
MongoDB

Infrastructure

Core
AWS
Proficient
Vercel · Docker · Linux · Nginx · Apache · Caddy

Tools

Proficient
Git · Figma

Languages

  • Japanese: JLPT N1 Perfect Score, Native Level
  • Chinese: Native Speaker (Beijing)
  • English: CET Level 4, Academic Writing Level

Certifications

  • 2025

    CATTI International (Translation)

    Chinese-Japanese Translation - China Accreditation Test for Translators and Interpreters (International)

  • 2025

    CATTI International (Interpretation)

    Chinese-Japanese Interpretation - China Accreditation Test for Translators and Interpreters (International)

  • 2022

    Japanese Driver's License

    Standard automobile license issued in Japan

  • 2020

    JLPT N1 (Perfect Score)

    Japanese Language Proficiency Test - Perfect score at highest level

  • 2019

    ICT Proficiency Test

    Information and Communication Technology certification

  • 2018

    Chinese Driver's License

    Standard automobile license issued in China

  • 2018

    CET-4 (College English Test Band 4)

    Standardized English proficiency test for Chinese university students