est. 2026 · personal diary · neocities

MediGlitterDiaries

💋 a chaotic little web diary full of glitter & weird feelings 💋
🩺
mediglitter · she/her · artist · goblin
📓 THE MEDIGLITTER DIARIES

⚑ Got Sick of Copy-Pasting, So I Built an Automated ETL Study Pipeline⚑

📝 The "Elevator Pitch"

If you use active recall to study, you know the struggle. I was spending hours manually copying text from my physics PDFs, pasting it into an AI, begging the AI to format it right, and then copying the results into my Obsidian vault. It was a tedious, mind-numbing loop.

So, I decided to stop doing it. I built a custom Python pipeline that reads my textbooks, extracts high-yield concepts using the Gemini API, formats them as perfect Markdown toggles, and injects them straight into my local Obsidian vault. Now, I just run a script, sit back, and let the machine build my flashcards while I chill.


🛠️ The Tech Stack

  • The Brain: Google Gemini API with the gemini-3.1-flash-lite model (configured with a temperature of 0.0 for strict, zero-fluff data extraction).
  • The Engine: Python (using the pypdf library to loop through the textbook pages in order).
  • The Database: Obsidian (the formatted Markdown toggles get appended directly to a local .md file in the vault).
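
For reference, a "toggle" here is just a nested Markdown bullet: the question on one line, the answer indented underneath so it can be collapsed. A minimal sketch of that shape, assuming a hypothetical helper called `to_toggles` (in the actual pipeline the model is asked to write this layout itself):

```python
def to_toggles(pairs):
    """Render (question, answer) pairs as nested Markdown bullets.

    Hypothetical helper for illustration only; the real pipeline asks
    the model to produce this layout directly.
    """
    lines = []
    for question, answer in pairs:
        lines.append(f"- {question.strip()}")
        lines.append(f"    - {answer.strip()}")  # 4-space indent nests the answer
    return "\n".join(lines)


cards = [
    ("What does Newton's second law state?",
     "Net force equals mass times acceleration."),
]
print(to_toggles(cards))
```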

⚙️ How It Works

I structured this as a classic ETL (Extract, Transform, Load) pipeline:

  1. Extract: The script opens my textbook PDF and reads it page by page, keeping only the plain text and skipping any page with fewer than 50 characters of it (blank and near-blank pages).
  2. Transform: It feeds the raw text to Gemini with a highly specific "Negative Prompt" (telling it exactly what not to do, like including page numbers or historical trivia) and tells it to output strict active-recall toggles: a question bullet with the answer nested underneath.
  3. Load: Python appends the generated questions to a note in my Obsidian vault under a per-page heading. If an API call fails (usually a rate limit), a retry loop waits a minute and tries again instead of crashing.
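
The blank-page check and the retry step above can be sketched as two small functions. This is a rough sketch, not the script itself: `is_blank_page`, `call_with_retries`, and their parameters are hypothetical names, and capping the number of attempts is an assumption added here so a permanent error (like a bad API key) cannot loop forever.

```python
import time


def is_blank_page(page_text, min_chars=50):
    """Treat a page as blank if it has fewer than min_chars of real text."""
    return not page_text or len(page_text.strip()) < min_chars


def call_with_retries(request, max_attempts=5, wait_seconds=60, sleep=time.sleep):
    """Call request() until it succeeds or max_attempts is reached.

    request is any zero-argument callable, e.g. a lambda wrapping the API call.
    sleep is injectable so the function can be tried out without real waiting.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except Exception as error:  # broad on purpose: rate limits, timeouts, etc.
            last_error = error
            print(f"Attempt {attempt} failed: {error}")
            if attempt < max_attempts:
                sleep(wait_seconds)
    raise RuntimeError(f"Gave up after {max_attempts} attempts") from last_error
```

Wrapping the API call would then look something like `call_with_retries(lambda: client.models.generate_content(model=current_model, contents=prompt))`.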

🎯 Why It Matters

"There are dozens of 'AI Flashcard' apps out there charging $15 to $20 a month. By using Python and Google's free API tier, I completely bypassed the paywalls."

More importantly, I have total control over the output. If the AI misses a concept, I don't have to wait for an app update; I just tweak my prompt and run it again. Learning to automate my own workflow was infinitely more rewarding than just paying for another subscription.
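
One way to make that kind of tweaking painless is to keep the rules in plain lists and assemble the prompt from them. A minimal sketch, assuming a hypothetical `build_prompt` helper; the rule text is copied from the prompt in the script, but the list names and the function are illustrative, not part of it:

```python
DO_NOT_RULES = [
    "NO page numbers, chapter titles, or headers.",
    "NO historical trivia or dates.",
    "NO conversational filler.",
]

MUST_DO_RULES = [
    "Extract definitions, laws, principles, and postulates.",
    'Extract the "Why" and "How" behind phenomena.',
]


def build_prompt(page_text, do_not=DO_NOT_RULES, must_do=MUST_DO_RULES):
    """Assemble the extraction prompt from editable rule lists."""
    parts = [
        "You are a strict exam tutor preparing a student for high-level exams.",
        "Read the provided text and extract ONLY the hard scientific concepts.",
        "",
        "CRITICAL 'DO NOT' RULES:",
        *[f"- {rule}" for rule in do_not],
        "",
        "CRITICAL 'MUST DO' RULES:",
        *[f"- {rule}" for rule in must_do],
        "- Format as:",
        "- Question?",
        "    - Answer.",
        "",
        "TEXT:",
        page_text,
    ]
    return "\n".join(parts)
```

Adding a rule is then a one-line change, for example `build_prompt(text, do_not=DO_NOT_RULES + ["NO worked examples."])`.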

💻 The Script

Here is the core logic of the pipeline. It's designed to be lightweight and efficient:


import time

from google import genai
from google.genai import types
from pypdf import PdfReader

# 1. Set up the client (the newer google-genai SDK)
# Put your API key here
client = genai.Client(api_key="blablablablablabla")

# The high-speed workhorse model
current_model = 'gemini-3.1-flash-lite-preview'

# Temperature 0.0 keeps the extraction strict and repeatable
generation_config = types.GenerateContentConfig(temperature=0.0)

pdf_file_path = r"blablablablabla"
obsidian_file_path = r"blablablablablablabla.md"

# Give up on a page after this many failed attempts (so a bad key can't loop forever)
MAX_RETRIES = 5

print(f"Opening PDF... Using {current_model}")
reader = PdfReader(pdf_file_path)

with open(obsidian_file_path, "a", encoding="utf-8") as file:

    # Starting from page 1 (pypdf counts from 0, so number the pages from 1 here)
    for page_number, page in enumerate(reader.pages, start=1):
        print(f"Reading page {page_number}...")
        page_text = page.extract_text()

        # Skip blank or near-blank pages (under 50 characters of text)
        if not page_text or len(page_text.strip()) < 50:
            continue

        prompt = f"""
        You are a strict exam tutor preparing a student for high-level exams.
        Read the provided text and extract ONLY the hard scientific concepts.

        CRITICAL 'DO NOT' RULES:
        - NO page numbers, chapter titles, or headers.
        - NO historical trivia or dates.
        - NO conversational filler.

        CRITICAL 'MUST DO' RULES:
        - Extract definitions, laws, principles, and postulates.
        - Extract the "Why" and "How" behind phenomena.
        - Format as:
        - Question?
            - Answer.

        TEXT:
        {page_text}
        """

        for attempt in range(1, MAX_RETRIES + 1):
            try:
                print(f"Asking AI for page {page_number}...")
                response = client.models.generate_content(
                    model=current_model,
                    contents=prompt,
                    config=generation_config,
                )

                # response.text can come back empty (e.g. a blocked reply),
                # so check it before writing the heading
                answer = (response.text or "").strip()
                if answer:
                    file.write(f"\n\n### Questions from Page {page_number}\n")
                    file.write(answer)
                    file.flush()  # Force-saves to your vault immediately
                    print(f"Page {page_number} saved! Pausing briefly...")
                else:
                    print(f"Nothing came back for page {page_number}, moving on...")

                time.sleep(5)
                break

            except Exception as e:
                print(f"Attempt {attempt} failed. Waiting for a minute... Error: {e}")
                time.sleep(60)
        else:
            # Runs only if every attempt failed
            print(f"Skipping page {page_number} after {MAX_RETRIES} failed attempts.")

print("Success! Your new chapter is ready in Obsidian.")
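
Because the vault file is opened in append mode, running the script a second time adds every page again. A small sketch of one way to spot pages that are already in the note, assuming the `### Questions from Page N` headings the script writes; `pages_already_done` is a hypothetical helper, not part of the original script:

```python
import re

# Matches the per-page headings the script writes into the note
PAGE_HEADING = re.compile(r"^### Questions from Page (\d+)\s*$", re.MULTILINE)


def pages_already_done(markdown_text):
    """Return the set of page numbers that already have a heading in the note."""
    return {int(number) for number in PAGE_HEADING.findall(markdown_text)}


existing = (
    "\n\n### Questions from Page 3\n- Q?\n    - A."
    "\n\n### Questions from Page 7\n- Q?\n    - A."
)
print(sorted(pages_already_done(existing)))  # prints [3, 7]
```

Inside the page loop, a page whose number is in that set could simply be skipped with `continue`. When the whole point of a rerun is a tweaked prompt, starting from a fresh .md file is probably the simpler option.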