---
marp: true
paginate: true
class: bagian-iv
header: 'RTI — Riset Teknologi Informasi | Universitas Putra Bangsa Kebumen'
footer: 'Helmi Bahar Alim, S.Kom., M.Kom. | 2026'
---
<style>
/* UPB Theme (inline) */
/*
============================================================
UPB MARP THEME — Riset Teknologi Informasi
Universitas Putra Bangsa (UPB), Kebumen
Fak. Sains & Teknologi | Prodi Teknik Informatika
------------------------------------------------------------
Usage in the slide front matter:
theme: upb
class: bagian-ii ← optional; switches the Part color
Classes per Part:
(empty / default) = Part I: Blue #2563EB
bagian-ii = Part II: Green #059669
bagian-iii = Part III: Orange #d97706
bagian-iv = Part IV: Purple #7c3aed
Special layout classes (use via <!-- _class: ... -->):
cover = Cover / title page
section-header = Divider between topics
integrative = Chapter 8 (midterm exam, blue-purple gradient)
fullcircle = Chapter 16 closing (dark gradient)
============================================================
*/
/* ============================================================
1. CSS CUSTOM PROPERTIES: DEFAULT (Part I · Blue)
============================================================ */
section {
--accent: #2563EB;
--accent-dark: #1e3a5f;
--accent-light: #eff6ff;
--accent-border: #bfdbfe;
--cover-grad: linear-gradient(135deg, #1e3a5f 0%, #2563EB 100%);
--cover-sub: #bfdbfe;
--cover-meta: #93c5fd;
font-family: 'Segoe UI', Arial, sans-serif;
font-size: 1.1em;
color: #1e293b;
padding: 40px 60px;
}
/* ============================================================
2. COLOR VARIANTS PER PART
============================================================ */
/* Part II: Green */
section.bagian-ii {
--accent: #059669;
--accent-dark: #064e3b;
--accent-light: #ecfdf5;
--accent-border: #a7f3d0;
--cover-grad: linear-gradient(135deg, #064e3b 0%, #059669 100%);
--cover-sub: #a7f3d0;
--cover-meta: #6ee7b7;
}
/* Part III: Orange */
section.bagian-iii {
--accent: #d97706;
--accent-dark: #78350f;
--accent-light: #fffbeb;
--accent-border: #fde68a;
--cover-grad: linear-gradient(135deg, #78350f 0%, #d97706 100%);
--cover-sub: #fde68a;
--cover-meta: #fcd34d;
}
/* Part IV: Purple */
section.bagian-iv {
--accent: #7c3aed;
--accent-dark: #3b0764;
--accent-light: #f5f3ff;
--accent-border: #ddd6fe;
--cover-grad: linear-gradient(135deg, #3b0764 0%, #7c3aed 100%);
--cover-sub: #ddd6fe;
--cover-meta: #c4b5fd;
}
/* ============================================================
3. LAYOUT: COVER
============================================================ */
section.cover {
background: var(--cover-grad);
color: white;
justify-content: center;
text-align: center;
}
/* The logo is loaded from CSS; no img tag is needed in the markdown */
section.cover::before {
content: '';
display: block;
width: 90px;
height: 90px;
background: white url('theme/logo-upb.png') center / contain no-repeat;
border-radius: 8px;
padding: 6px;
margin: 0 auto 14px;
box-sizing: border-box;
}
section.cover h1 {
color: white;
font-size: 2em;
margin-bottom: 8px;
border-bottom: none;
}
section.cover h2 {
color: var(--cover-sub);
font-size: 1.1em;
font-weight: normal;
}
section.cover p { color: var(--cover-meta); font-size: 0.85em; }
section.cover strong { color: white; }
/* ============================================================
4. LAYOUT: SECTION HEADER (topic divider)
============================================================ */
section.section-header {
background: var(--accent);
color: white;
justify-content: center;
text-align: center;
}
section.section-header h1 {
color: white;
font-size: 2.5em;
border-bottom: none;
}
section.section-header h2 { color: rgba(255, 255, 255, 0.85); }
/* ============================================================
5. LAYOUT: INTEGRATIVE (Chapter 8: midterm checkpoint)
============================================================ */
section.integrative {
background: linear-gradient(135deg, #1e3a5f 0%, #7c3aed 100%);
color: white;
justify-content: center;
text-align: center;
}
section.integrative::before {
content: '';
display: block;
width: 90px;
height: 90px;
background: white url('theme/logo-upb.png') center / contain no-repeat;
border-radius: 8px;
padding: 6px;
margin: 0 auto 14px;
box-sizing: border-box;
}
section.integrative h1 {
color: white;
font-size: 2.2em;
border-bottom: none;
}
section.integrative h2 { color: #ddd6fe; font-size: 1.1em; }
section.integrative p { color: #c4b5fd; font-size: 0.85em; }
section.integrative strong { color: white; }
/* ============================================================
6. LAYOUT: FULLCIRCLE (Chapter 16: closing)
============================================================ */
section.fullcircle {
background: linear-gradient(135deg, #1e293b 0%, #1e3a5f 50%, #1e293b 100%);
color: white;
justify-content: center;
text-align: center;
}
section.fullcircle h1 {
color: #ddd6fe;
font-size: 2.2em;
border-bottom: 3px solid #7c3aed;
}
section.fullcircle blockquote {
border-left: 5px solid #7c3aed;
background: rgba(255, 255, 255, 0.08);
color: #ddd6fe;
font-style: italic;
}
/* ============================================================
7. CONTENT ELEMENTS: use the CSS variables of the current Part
============================================================ */
h1 {
color: var(--accent);
border-bottom: 3px solid var(--accent);
padding-bottom: 8px;
}
h2 { color: var(--accent-dark); font-size: 1.3em; }
h3 { color: var(--accent); font-size: 1.05em; }
blockquote {
border-left: 5px solid var(--accent);
background: var(--accent-light);
padding: 12px 20px;
margin: 16px 0;
color: var(--accent-dark);
font-style: italic;
border-radius: 0 8px 8px 0;
}
table { font-size: 0.82em; width: 100%; border-collapse: collapse; }
th { background: var(--accent); color: white; padding: 8px 12px; }
td { padding: 6px 12px; border-bottom: 1px solid var(--accent-border); }
tr:nth-child(even) td { background: var(--accent-light); }
code {
background: var(--accent-light);
color: var(--accent-dark);
padding: 2px 6px;
border-radius: 4px;
font-size: 0.9em;
}
pre {
background: #f1f5f9;
color: #1e293b;
padding: 16px;
border-radius: 8px;
border-left: 4px solid var(--accent-border);
}
ul li, ol li { margin-bottom: 6px; line-height: 1.6; }
/* ============================================================
8. HELPER CLASSES
============================================================ */
/* Status / emphasis */
.warn { color: #d97706; font-weight: bold; }
.good { color: #059669; font-weight: bold; }
.bad { color: #dc2626; font-weight: bold; }
/* Final-statement box */
.final {
background: #fef3c7;
border-left: 5px solid #d97706;
padding: 14px 20px;
border-radius: 0 8px 8px 0;
font-size: 1.1em;
}
/* Highlight box */
.highlight-box {
background: var(--accent);
color: white;
padding: 16px 20px;
border-radius: 8px;
margin: 12px 0;
}
/* Chapter 8 checkpoint box */
.checkpoint {
background: #f5f3ff;
border: 2px solid #7c3aed;
border-radius: 8px;
padding: 16px 24px;
margin: 16px 0;
}
/* ============================================================
9. PAGINATION & HEADER/FOOTER
============================================================ */
section::after {
font-size: 0.7em;
color: #94a3b8;
}
section[data-marpit-advanced-background] > div:last-child { padding: 40px 60px; }
</style>
<!-- _class: cover bagian-iv -->

# Chapter 13: Data Preprocessing

## Turning Raw Data into Analysis-Ready Data

*Session 13 (M13) &nbsp;|&nbsp; Sub-CPMK 4.2 &nbsp;|&nbsp; CPMK04 &nbsp;|&nbsp; CPL07*

Phase: **Analyzing & Communicating** (M12-M16) &nbsp;·&nbsp; Part IV

**Universitas Putra Bangsa** &nbsp;|&nbsp; Fak. Sains & Teknologi &nbsp;·&nbsp; Prodi Teknik Informatika

---
## Session 13 Agenda

1. Preprocessing vs. validation: the critical difference
2. Data Refinement Pipeline
3. Data cleaning: missing values, duplicates, errors
4. Data transformation: encoding, aggregation, feature creation
5. Normalization & scaling
6. Four principles of preprocessing
7. Cognitive traps & case study
8. Practical output: a clean dataset plus preprocessing documentation
---
## Learning Outcomes

After this session, students are able to:

- Distinguish **data validation** (Chapter 11) from **data preprocessing** (Chapter 13)
- Apply **data cleaning** techniques: handling missing values, duplicates, and outliers
- Perform appropriate **data transformations** (encoding, normalization, aggregation)
- Document every preprocessing step so that the work is **reproducible**
- Apply the **4 principles of preprocessing** to avoid introducing bias

> Sub-CPMK 4.2 → Perform data preprocessing for analysis (CPL07)
---
## Data Refinement Pipeline

*From validated data to analysis-ready data*

<div class="highlight-box">

**Validated Raw Data** (Chapter 11) &darr; Cleaning (remove noise, missing values, duplicates) &darr; Transformation (change format/representation) &darr; Normalization (adjust scale/distribution) &darr; **Processed Data** &darr; Analysis Ready (input to Chapter 14)

</div>

> **Fundamental principle:** Preprocessing must be **reproducible** and **documented**. No step may be performed without leaving a trace.
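---

The requirement that no step runs without leaving a trace can be sketched as a small pipeline runner that logs every step. This is a minimal illustration in plain Python, not the course's implementation: `run_pipeline`, the two step functions, and the four sample rows are hypothetical names and values introduced here.

```python
def run_pipeline(rows, steps):
    """Apply each (name, step) pair in order and log row counts before/after."""
    log = []
    for name, step in steps:
        before = len(rows)
        rows = step(rows)
        log.append(f"{name}: {before} -> {len(rows)} rows")
    return rows, log


def drop_missing_f1(rows):
    """Cleaning step: remove runs whose f1_micro value is missing."""
    return [r for r in rows if r.get("f1_micro") is not None]


def drop_duplicate_runs(rows):
    """Cleaning step: keep only the first record for each run_id."""
    seen = set()
    kept = []
    for r in rows:
        if r["run_id"] not in seen:
            seen.add(r["run_id"])
            kept.append(r)
    return kept


# Hypothetical validated log: run 1 is logged twice, run 2 has no score.
raw = [
    {"run_id": 1, "f1_micro": 0.81},
    {"run_id": 1, "f1_micro": 0.81},
    {"run_id": 2, "f1_micro": None},
    {"run_id": 3, "f1_micro": 0.84},
]

processed, log = run_pipeline(
    raw,
    [("drop_missing_f1", drop_missing_f1),
     ("drop_duplicate_runs", drop_duplicate_runs)],
)
for entry in log:
    print(entry)
```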
---
## Validation vs. Preprocessing: A Clear Line

*A distinction that is often confused*

| Aspect | Validation (Chapter 11) | Preprocessing (Chapter 13) |
|-------|-----------------|----------------------|
| **Goal** | Ensure the data are correct | Prepare the data for analysis |
| **Question** | "Is this data valid?" | "How do we optimize the data for analysis?" |
| **Action** | Identify problems + decide how to handle them | Transformation + normalization |
| **Output** | Valid dataset + anomaly notes | Analysis-ready dataset |
| **Order** | **First** | **Second** (after validation) |

> If preprocessing is done before validation, we may end up "fixing" data that should have been investigated further.
---
## Data Cleaning: Three Main Problems

### 1. Missing Values

```python
import pandas as pd

# Identify: count missing values per column
print(df.isnull().sum())

# Handling strategies (choose based on context):
# a. Drop rows if they are few and not systematic
df = df.dropna(subset=['f1_micro'])
# b. Impute with the mean (continuous, non-critical data only);
#    assign back, since in-place fillna on a selected column may not update df
df['time_sec'] = df['time_sec'].fillna(df['time_sec'].mean())
# c. Flag as a separate "unknown" category (for categorical data)
df['hardware'] = df['hardware'].fillna('unknown')
```
### 2. Duplikat (run yang ter-log dua kali)
```python
df.drop_duplicates(subset=['run_id'], keep='first', inplace=True)
```
### 3. Format Error (nilai "N/A" teks di kolom numerik)
```python
df['time_sec'] = pd.to_numeric(df['time_sec'], errors='coerce')
```
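
---

## Data Cleaning: Worked Sketch

The three fixes interact, so their order matters: mean imputation only works once the column is numeric. Below is a minimal sketch on a made-up five-row log. The values and the helper name `clean_runs` are illustrative assumptions, not course data.

```python
import pandas as pd

def clean_runs(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: apply the three cleaning fixes in a safe order."""
    out = df.copy()
    # 1. Format errors first: "N/A" text becomes NaN, column becomes numeric.
    out["time_sec"] = pd.to_numeric(out["time_sec"], errors="coerce")
    # 2. Duplicates: keep the first log entry of each run.
    out = out.drop_duplicates(subset=["run_id"], keep="first")
    # 3a. Drop rows that lack the key metric.
    out = out.dropna(subset=["f1_micro"])
    # 3b. Impute remaining numeric gaps with the column mean.
    out["time_sec"] = out["time_sec"].fillna(out["time_sec"].mean())
    # 3c. Flag missing categories explicitly.
    out["hardware"] = out["hardware"].fillna("unknown")
    return out.reset_index(drop=True)

raw = pd.DataFrame({
    "run_id":   [1, 2, 2, 3, 4],
    "f1_micro": [0.80, 0.82, 0.82, None, 0.78],
    "time_sec": ["10.0", "N/A", "N/A", "12.0", "14.0"],
    "hardware": ["gpu", None, None, "cpu", "gpu"],
})
clean = clean_runs(raw)
print(clean)
```

Working on a copy keeps the validated input untouched, so the cleaning can be rerun from the same starting point.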
---
## Data Transformation
*Changing the data representation to make analysis easier*
### Encoding Categorical Variables
```python
# Label encoding (for ordinal variables)
scenario_map = {'baseline': 0, '+attention': 1, '+ensemble': 2}
df['scenario_code'] = df['scenario'].map(scenario_map)
# One-hot encoding (for nominal variables, which have no order)
df_encoded = pd.get_dummies(df, columns=['hardware'])
```
### Aggregation
```python
# Compute statistics per scenario (from 10 runs down to 1 row per scenario)
summary = df.groupby('scenario').agg({
'f1_micro': ['mean', 'std', 'min', 'max'],
'time_sec': ['mean', 'std']
}).round(4)
```
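
---

## Data Transformation: Worked Sketch

To see what the encoding and aggregation calls produce, here is a small sketch on a made-up four-row frame (the numbers are illustrative assumptions). It also flattens the two-level column names that `agg` returns, which is one possible way to make the summary easier to export.

```python
import pandas as pd

df = pd.DataFrame({
    "scenario": ["baseline", "baseline", "+attention", "+attention"],
    "hardware": ["gpu", "cpu", "gpu", "gpu"],
    "f1_micro": [0.70, 0.72, 0.80, 0.84],
})

# Ordinal mapping: codes follow the assumed order of the scenarios.
scenario_map = {"baseline": 0, "+attention": 1, "+ensemble": 2}
df["scenario_code"] = df["scenario"].map(scenario_map)

# One-hot encoding: one indicator column per hardware value.
encoded = pd.get_dummies(df, columns=["hardware"])

# Aggregation yields two-level column names; join them into flat names.
summary = df.groupby("scenario").agg({"f1_micro": ["mean", "std"]})
summary.columns = ["_".join(col) for col in summary.columns]

print(encoded.columns.tolist())
print(summary.round(4))
```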
---
## Normalization & Scaling
*Why it is needed and when to use it*
| Technique | Formula | When to Use |
|-----------|---------|-------------|
| Min-Max Normalization | $x' = \frac{x - x_{min}}{x_{max} - x_{min}}$ | The distribution is unknown and a [0, 1] scale is needed |
| Z-Score Standardization | $z = \frac{x - \mu}{\sigma}$ | The data is assumed to be roughly normal, or the ML algorithm is scale-sensitive |
| Log Transformation | $x' = \log(x)$, $x > 0$ | The data is heavily skewed (e.g. execution time) |
| Robust Scaling | $x' = \frac{x - Q_2}{Q_3 - Q_1}$, with $Q_2$ the median | Outliers are present and cannot be removed |
> **Important:** Fit the scaler ONLY on the training data, then apply it to the test data. Fitting on the whole dataset causes data leakage!
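
---

## Normalization & Scaling: Worked Sketch

The four formulas can be compared directly on one small array. This is a sketch with made-up execution times that include one outlier. It uses plain NumPy rather than any scaler class, and it takes $\sigma$ as the population standard deviation.

```python
import numpy as np

# Made-up execution times in seconds, with one large outlier.
x = np.array([1.0, 2.0, 3.0, 4.0, 100.0])

min_max = (x - x.min()) / (x.max() - x.min())
z_score = (x - x.mean()) / x.std()            # population sigma (ddof=0)
log_x = np.log(x)                             # requires x > 0
q1, q2, q3 = np.percentile(x, [25, 50, 75])   # quartiles; q2 is the median
robust = (x - q2) / (q3 - q1)

for name, values in [("min-max", min_max), ("z-score", z_score),
                     ("log", log_x), ("robust", robust)]:
    print(f"{name:8s}", np.round(values, 3))
```

The outlier squeezes the min-max values of the other four points to about 0.03 or less, while robust scaling keeps them spread between -1 and 0.5.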
---
## Four Preprocessing Principles
*The standard that separates scientific preprocessing from ad hoc work*
**1. Consistency**: Apply exactly the same steps to every scenario
```python
# One preprocessing function, called for every scenario.
# `pipeline` is assumed to be fitted once beforehand (on training data only),
# so every scenario is transformed with the same parameters.
def preprocess(df): return pipeline.transform(df)
```
**2. Transparency**: Every step is documented together with its rationale
```
[STEP-01] Normalize execution time with a log transform.
Reason: the distribution is strongly right-skewed (skewness=4.3).
```
**3. Reproducibility**: Save the pipeline as code, not as manual steps
```python
import joblib
joblib.dump(pipeline, 'preprocessing_pipeline.pkl')
```
**4. Minimal Distortion**: Do not remove important characteristics of the data
> Normalization may change the scale, but it must not change the ordering or the relative relationships in the data.
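
---

## Minimal Distortion: Worked Sketch

The fourth principle can be checked mechanically: a transformation that respects it leaves the rank order of the samples unchanged. The sketch below uses a hypothetical helper, `preserves_order`, and made-up values. Clipping is included only as a contrasting example of a step that merges distinct values.

```python
import numpy as np

def preserves_order(original, transformed) -> bool:
    """Hypothetical check: True if both arrays sort the samples identically."""
    return bool(np.array_equal(np.argsort(original, kind="stable"),
                               np.argsort(transformed, kind="stable")))

x = np.array([3.2, 0.5, 48.0, 7.1])           # made-up execution times
min_max = (x - x.min()) / (x.max() - x.min())
log_x = np.log1p(x)
clipped = np.clip(x, None, 5.0)               # caps 48.0 and 7.1 at 5.0

print(preserves_order(x, min_max))   # scale changes, order stays
print(preserves_order(x, log_x))     # monotonic, order stays
print(preserves_order(x, clipped))   # ties created, order information lost
```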
---
<!-- _class: section-header bagian-iv -->
# Cognitive Traps
## Chapter 13: Data Preprocessing
---
## Cognitive Traps: Chapter 13
**"Preprocess first, understand the data later"**
Preprocessing without understanding the context can introduce bias without anyone noticing. Always do exploratory analysis first, then decide which preprocessing is appropriate.
**"Normalization is always needed"**
Some algorithms (decision trees, random forests) are not sensitive to feature scale, so normalization does not always improve performance. Choose based on what the algorithm needs, not out of habit.
**"Data leakage doesn't matter because we already know the result"**
Data leakage (fitting the scaler on the entire dataset, test set included) fundamentally invalidates the results. A high score caused by data leakage is not an achievement; it is an artifact.
**"Preprocessing steps don't need to be documented one by one"**
If preprocessing is undocumented, the research cannot be reproduced. Research that cannot be reproduced cannot be verified. This is the minimum standard for scientific publication.
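
---

## Cognitive Traps: Why Trees Ignore Scale

The claim that tree-based models are insensitive to scale can be illustrated with a toy decision stump. This is a simplified stand-in written for this slide, not scikit-learn's implementation, and the data is made up. It picks the threshold with the fewest label errors and shows that min-max scaling leaves the chosen partition unchanged.

```python
import numpy as np

def best_split_mask(x, y):
    """Toy decision stump: left/right partition that minimizes label errors."""
    values = np.sort(np.unique(x))
    thresholds = (values[:-1] + values[1:]) / 2    # midpoints between values
    best_mask, best_err = None, np.inf
    for t in thresholds:
        left = x <= t
        # Predict the majority label on each side and count the mistakes.
        err = sum(min(np.sum(y[m] == 0), np.sum(y[m] == 1))
                  for m in (left, ~left))
        if err < best_err:
            best_mask, best_err = left, err
    return best_mask

x = np.array([2.0, 3.0, 10.0, 50.0, 80.0, 120.0])   # made-up feature
y = np.array([0, 0, 0, 1, 1, 1])                    # made-up labels
x_scaled = (x - x.min()) / (x.max() - x.min())      # min-max to [0, 1]

same = np.array_equal(best_split_mask(x, y), best_split_mask(x_scaled, y))
print(same)
```

The threshold value changes with the scale, but the samples that fall on each side do not, which is all a tree uses.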
---
## Case Study 1: Data Leakage (Basic)
**Context:** A student normalizes the data before the train/test split.
```python
# Data leakage
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X) # Fit on ALL data, including the test set!
X_train, X_test = train_test_split(X_scaled, test_size=0.2)
# Problem: the scaler has "seen" the test data → the test data is no longer independent
```
```python
# Correct
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
X_train, X_test = train_test_split(X, test_size=0.2)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train) # Fit ONLY on the training data
X_test_scaled = scaler.transform(X_test) # Only transform the test data
```
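
---

## Case Study 1: Leakage-Safe Cross-Validation

With cross-validation the same rule applies to every fold. One way to enforce it, sketched here under assumptions, is to wrap the scaler and the model in a scikit-learn `Pipeline`. The synthetic data and the choice of `LogisticRegression` are placeholders, not part of the case study.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data; the real X and y come from the experiment.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

pipe = Pipeline([
    ("scaler", StandardScaler()),                 # refit on each training fold
    ("clf", LogisticRegression(max_iter=1000)),   # placeholder model
])

# cross_val_score fits the whole pipeline on the training part of each fold,
# so the scaler never sees the held-out fold.
scores = cross_val_score(pipe, X, y, cv=5)
print(scores.mean())
```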
---
## Case Study 2: Inconsistent Preprocessing (Advanced)
**Context:** A researcher compares Model A and Model B.
**Problem:**
- Model A: preprocessing with Min-Max normalization
- Model B: preprocessing with Z-score standardization
- The two models are compared as an "architecture comparison"
**But different preprocessing means more than the architecture differs!**
Model B may be better not because of its architecture, but because Z-score suits the distribution of this data better.
**Solution:**
1. Define one shared preprocessing pipeline for all models being compared
2. If the goal is to compare preprocessing techniques, make that an explicit experimental variable
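
---

## Case Study 2: Experiment Grid Sketch

Treating preprocessing as an explicit variable amounts to a full factorial design: every model is run with every preprocessing option, and only runs that differ in one factor are compared. Below is a minimal sketch in plain Python. The factor values and the helper `differs_in_one_factor` are hypothetical names chosen for illustration.

```python
from itertools import product

preprocessors = ["min-max", "z-score"]     # factor 1: explicit variable
models = ["model_a", "model_b"]            # factor 2: hypothetical names

# Full factorial grid: every model is run with every preprocessing option.
runs = [{"preprocessing": p, "model": m}
        for p, m in product(preprocessors, models)]

def differs_in_one_factor(a: dict, b: dict) -> bool:
    """A comparison is interpretable only if exactly one factor changes."""
    return sum(a[k] != b[k] for k in a) == 1

# The comparison from the case study changes both factors at once.
confounded = ({"preprocessing": "min-max", "model": "model_a"},
              {"preprocessing": "z-score", "model": "model_b"})

print(len(runs))
print(differs_in_one_factor(*confounded))
print(differs_in_one_factor(runs[0], runs[1]))
```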
---
## Preprocessing Documentation
*The standard format that must appear in the report*
```
PREPROCESSING DOCUMENTATION
Dataset: exp03_summary_validated.csv (after Chapter 11 validation)
Preprocessing date: 2026-05-05
Researcher: [Name]
PREPROCESSING STEPS:
[STEP-01] Log transform on column 'time_sec'
Reason: skewness = 4.3 (right-skewed). The log transform reduces it to 0.8.
Code: df['time_sec'] = np.log1p(df['time_sec'])
[STEP-02] One-hot encoding on column 'scenario'
Reason: nominal variable (no ordering).
Output: 3 dummy columns (scenario_baseline, scenario_+attention, scenario_+ensemble)
[STEP-03] Min-Max normalization on all numeric features
Reason: SVMs and neural networks are sensitive to scale.
Important: The scaler is fit ONLY on the training fold (cross-validation).
OUTPUT: exp03_processed.csv + preprocessing_pipeline.pkl
```
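
---

## Preprocessing Documentation: Generating the Log

Writing the [STEP-XX] entries by hand invites gaps, so one option is to record them from the script that does the preprocessing. The class below is a hypothetical helper sketched for this purpose, not an existing library. Its labels are written in English.

```python
from dataclasses import dataclass, field

@dataclass
class PreprocessingLog:
    """Hypothetical helper that renders steps in the [STEP-XX] format."""
    dataset: str
    steps: list = field(default_factory=list)

    def add(self, action: str, reason: str, code: str = "") -> None:
        # Store each step in the order it was applied.
        self.steps.append((action, reason, code))

    def render(self) -> str:
        lines = ["PREPROCESSING DOCUMENTATION",
                 f"Dataset: {self.dataset}",
                 "PREPROCESSING STEPS:"]
        for i, (action, reason, code) in enumerate(self.steps, start=1):
            lines.append(f"[STEP-{i:02d}] {action}")
            lines.append(f"Reason: {reason}")
            if code:
                lines.append(f"Code: {code}")
        return "\n".join(lines)

log = PreprocessingLog("exp03_summary_validated.csv")
log.add("Log transform on column 'time_sec'",
        "skewness = 4.3 (right-skewed).",
        "df['time_sec'] = np.log1p(df['time_sec'])")
log.add("Min-Max normalization on all numeric features",
        "SVMs and neural networks are sensitive to scale.")
print(log.render())
```

Calling `log.add(...)` next to each transformation keeps the documentation and the code in the same place.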
---
## Session 13 Summary
| Concept | Key Point |
|---------|-----------|
| Preprocessing vs Validation | Validate first (check correctness), preprocess afterwards (prepare for analysis) |
| Cleaning | Missing values (drop/impute/flag) + Duplicates + Format Errors |
| Transformation | Categorical encoding + Aggregating runs → per scenario |
| Normalization | Min-Max/Z-score/Log/Robust depending on context + avoid data leakage |
| 4 Principles | Consistency · Transparency · Reproducibility · Minimal Distortion |
---
## Final Statement & Practical Output
<div class="final">
"Undocumented preprocessing is a black box: no one can verify whether the transformations applied were valid, including the researcher themselves six months later."
</div>
### Practical Output M13
Submit:
1. **Clean dataset** (`exp_processed.csv`, ready for analysis)
2. **Preprocessing pipeline** (Python code / `.pkl` file)
3. **Preprocessing documentation** ([STEP-XX] format, complete with reasons)
---
## Main References: Chapter 13
- Famili, A., Shen, W. M., Weber, R., & Simoudis, E. (1997). Data preprocessing and intelligent data analysis. *Intelligent Data Analysis, 1*(1), 3–23.
- García, S., Luengo, J., & Herrera, F. (2015). *Data preprocessing in data mining*. Springer.
- Kaufman, S., Rosset, S., Perlich, C., & Stitelman, O. (2012). Leakage in data mining: Formulation, detection, and avoidance. *ACM Transactions on Knowledge Discovery from Data, 6*(4), 1–21.
- Géron, A. (2022). *Hands-on machine learning with Scikit-Learn, Keras, and TensorFlow* (3rd ed.). O'Reilly Media.