# WS-13: Preprocessing
> **Related Chapter:** Chapter 13: Data Preprocessing
> **Objective:** Handle missing values, build a preprocessing pipeline, and detect leakage
> **Reference:** Appendix B.13 | Template A.13

---

## Exercise 1: Missing Value Strategy

**Dataset:** ____________________________________________
**Percentage of missing values:** __________________________

| Strategy | Mean After | Does the Comparison Conclusion Change? |
|----------|------------------|-------------------------------|
| Listwise deletion | | [ ] Yes / [ ] No |
| Mean imputation | | [ ] Yes / [ ] No |
| Flag & report | | [ ] Yes / [ ] No |

**Chosen strategy:** _______________________________
**Justification:** ________________________________________
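
The three strategies in the table can be tried out with a few lines of code before filling in the "Mean After" column. The following is a minimal Python sketch using only the standard library. The two groups and their values are hypothetical, and `None` is assumed to mark a missing value; adapt both to the dataset named above.

```python
from statistics import mean, stdev

def listwise_deletion(values):
    """Drop every missing entry (None) and keep only the observed values."""
    return [v for v in values if v is not None]

def mean_imputation(values):
    """Replace each missing entry with the mean of the observed values."""
    fill = mean(listwise_deletion(values))
    return [fill if v is None else v for v in values]

def flag_and_report(values):
    """Leave the data untouched; return a missing-flag column and the missing share."""
    flags = [v is None for v in values]
    return flags, sum(flags) / len(values)

# Hypothetical measurements for two groups; None marks a missing value.
group_a = [10.0, 12.0, None, 14.0]
group_b = [11.0, None, None, 19.0]

for name, values in (("A", group_a), ("B", group_b)):
    deleted = listwise_deletion(values)
    imputed = mean_imputation(values)
    flags, share = flag_and_report(values)
    print(
        f"{name}: deletion mean={mean(deleted)}, imputed mean={mean(imputed)}, "
        f"deletion sd={stdev(deleted):.2f}, imputed sd={stdev(imputed):.2f}, "
        f"missing={share:.0%}"
    )
```

For a single column, mean imputation leaves the mean equal to the listwise-deletion mean and shrinks the spread, so a change in the comparison conclusion is more likely to appear in the variance or in a test statistic than in the mean itself.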

---

## Exercise 2: Preprocessing Pipeline

**Language/tool used:** __________________________

| Step | Operation | Input | Output | Comment |
|------|---------|-------|--------|---------|
| 1 | Cleaning | | | |
| 2 | Encoding (if needed) | | | |
| 3 | Normalization (if needed) | | | |
| 4 | Feature engineering (if needed) | | | |

**Reference script/file:** ______________________________
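
The four steps in the table can be chained into one small pipeline that also records the input and output shape of each step, which maps directly onto the Input and Output columns. This is only a sketch in plain Python: the column names (`age`, `city`, `height_cm`, `weight_kg`), the BMI feature, and the sample rows are all hypothetical, and a real project would more likely use a dataframe library.

```python
def clean(rows):
    """Step 1: drop rows with a missing height or weight."""
    return [r for r in rows if r["height_cm"] is not None and r["weight_kg"] is not None]

def encode(rows):
    """Step 2: one-hot encode the categorical 'city' column."""
    cities = sorted({r["city"] for r in rows})
    encoded = []
    for r in rows:
        new = {k: v for k, v in r.items() if k != "city"}
        for c in cities:
            new[f"city_{c}"] = 1 if r["city"] == c else 0
        encoded.append(new)
    return encoded

def normalize_age(rows):
    """Step 3: min-max scale 'age' to [0, 1].
    If the data is later split into train and test sets, compute lo/hi
    from the training rows only."""
    ages = [r["age"] for r in rows]
    lo, hi = min(ages), max(ages)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant column
    return [{**r, "age": (r["age"] - lo) / span} for r in rows]

def add_bmi(rows):
    """Step 4: feature engineering, here a body-mass index column."""
    return [{**r, "bmi": r["weight_kg"] / (r["height_cm"] / 100) ** 2} for r in rows]

def shape(rows):
    """Return (row count, column count) for a list of dict records."""
    return (len(rows), len(rows[0]) if rows else 0)

def run_pipeline(rows, steps):
    """Apply each step in order and log (number, name, input shape, output shape)."""
    log = []
    for number, (name, func) in enumerate(steps, start=1):
        before = shape(rows)
        rows = func(rows)
        log.append((number, name, before, shape(rows)))
    return rows, log

# Hypothetical raw records; None marks a missing value.
raw = [
    {"age": 20, "city": "Bandung", "height_cm": 170.0, "weight_kg": 65.0},
    {"age": 30, "city": "Jakarta", "height_cm": None, "weight_kg": 70.0},
    {"age": 40, "city": "Jakarta", "height_cm": 160.0, "weight_kg": 64.0},
]

steps = [
    ("Cleaning", clean),
    ("Encoding", encode),
    ("Normalization", normalize_age),
    ("Feature engineering", add_bmi),
]

processed, log = run_pipeline(raw, steps)
for entry in log:
    print(entry)
```

Keeping every step as a named function makes the order explicit and lets each row of the worksheet table point at one function in the reference script.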

---

## Exercise 3: Leakage Detection

| Potential Leakage | Found? | Fix |
|----------------|-----------|----------|
| Test data included in training | [ ] Yes / [ ] No | |
| Future information in features | [ ] Yes / [ ] No | |
| Target variable used in preprocessing | [ ] Yes / [ ] No | |
| Normalization before the split | [ ] Yes / [ ] No | |

**Leakage check conclusion:**
- [ ] No leakage, because: _______________________
- [ ] Leakage found, fixed by: ___________________
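
Two of the checks in the table, test rows appearing in the training set and normalization before the split, can be illustrated with a short sketch. It uses only the Python standard library; the row ids, the feature values, and the 4/1 split are hypothetical and chosen so the effect of the leak is easy to see.

```python
from statistics import mean, pstdev

def fit_scaler(values):
    """Learn the mean and population standard deviation from the given values only."""
    return mean(values), pstdev(values)

def transform(values, mu, sigma):
    """Standardize values with previously fitted statistics."""
    return [(v - mu) / sigma for v in values]

# Hypothetical rows as (row_id, feature_value); the last row is an outlier.
rows = [(1, 1.0), (2, 2.0), (3, 3.0), (4, 4.0), (5, 100.0)]

# Split FIRST, before any statistic is computed.
train_rows, test_rows = rows[:4], rows[4:]

# Check: no row id may appear in both sets.
overlap = {rid for rid, _ in train_rows} & {rid for rid, _ in test_rows}

train = [v for _, v in train_rows]
test = [v for _, v in test_rows]

# Leaky version: statistics computed on all rows, so the test outlier
# influences how the training data is scaled.
leaky_mu, leaky_sigma = fit_scaler(train + test)

# Safe version: statistics come from the training rows only and are
# reused unchanged for the test rows.
mu, sigma = fit_scaler(train)
train_scaled = transform(train, mu, sigma)
test_scaled = transform(test, mu, sigma)

print("overlapping ids:", overlap)
print("leaky mean:", leaky_mu, "train-only mean:", mu)
```

The gap between the two means shows how much information from the test row would have entered the training features if normalization had been done before the split.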

---

## Reflection

> *"If I delete a single row of data, can I explain why, and would others agree?"*

**Reflection answer:**
> ___________________________________________________

---
<!-- Worksheet for Chapter 13: Data Preprocessing -->