Automate Google NotebookLM from Python with the unofficial notebooklm-py library. Covers authentication, notebook and source management, summaries, FAQ generation, and audio podcast creation.
notebooklm-py — Unofficial NotebookLM Client
What it is
notebooklm-py is an unofficial, community-maintained Python client that automates NotebookLM — Google's AI-powered note-taking and research tool. It reverse-engineers the internal NotebookLM web API to allow programmatic notebook management, source ingestion, summary generation, and audio podcast creation from Python scripts or CI pipelines.
This library is unofficial and unsupported by Google. It depends on internal NotebookLM web APIs that can change or break without notice. Do not use it for production systems or critical workflows. Google has no obligation to maintain API compatibility. Check the project's GitHub issues before upgrading.
Terms of Service — automated access to NotebookLM may violate Google's Terms of Service for the product. Use only for personal research, evaluation, or educational purposes. Do not use at scale or in ways that impose significant load on Google's servers.
Install
pip install notebooklm-py
Output: (none — exits 0 on success)
Authentication
notebooklm-py authenticates using Google account cookies extracted from a logged-in browser session. The most reliable method uses a browser extension such as Cookie-Editor to export the cookies in either JSON or Netscape format.
from notebooklm import NotebookLM
# Method 1 — from a cookies.json file (exported from browser)
client = NotebookLM.from_cookies_file("cookies.json")
# Method 2 — from a cookies string (Netscape format)
with open("cookies.txt") as f:
    cookies_text = f.read()
client = NotebookLM.from_cookies(cookies_text)
Cookies expire. Re-export and refresh your cookies.json whenever you see 401 Unauthorized errors. Log into NotebookLM in your browser, export fresh cookies, and update the file.
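One way to catch stale cookies before a 401 appears is to inspect the expiry timestamps in the exported file. The sketch below assumes the export is a JSON list of cookie objects with a `name` and an optional `expirationDate` field holding Unix seconds, which is how Cookie-Editor typically writes its JSON export. The helper name `expired_cookie_names` is hypothetical and not part of notebooklm-py.

```python
import json
import time
from typing import List, Optional


def expired_cookie_names(cookies_json: str, now: Optional[float] = None) -> List[str]:
    """Return names of cookies whose expiry timestamp is in the past.

    Assumes a JSON list of objects with "name" and an optional
    "expirationDate" (Unix seconds). Session cookies without that
    field are skipped because they carry no expiry to check.
    """
    now = time.time() if now is None else now
    expired = []
    for cookie in json.loads(cookies_json):
        expires = cookie.get("expirationDate")
        if expires is not None and float(expires) <= now:
            expired.append(cookie.get("name", "<unnamed>"))
    return expired


# Example with inline data instead of a real export.
sample = json.dumps([
    {"name": "SID", "value": "x", "expirationDate": 1000.0},
    {"name": "HSID", "value": "y", "expirationDate": 5000.0},
    {"name": "session_only", "value": "z"},
])
print(expired_cookie_names(sample, now=2000.0))  # ['SID']
```

A timestamp in the future does not guarantee that the server still accepts the session, so this check complements handling of 401 errors rather than replacing it.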
Store cookies.json securely; it grants full access to your Google account. Never commit it to version control. Add it to .gitignore and use environment variables or a secrets manager in CI/CD.
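For CI/CD, one possible pattern is to keep the cookie JSON in a secret that the pipeline exposes as an environment variable, then write it to a private temporary file just before use. This is a minimal sketch: the variable name `NOTEBOOKLM_COOKIES_JSON` and the helper `cookies_file_from_env` are assumptions made for illustration, not names defined by the library.

```python
import json
import os
import tempfile


def cookies_file_from_env(var_name: str = "NOTEBOOKLM_COOKIES_JSON") -> str:
    """Write cookie JSON held in an environment variable to a private temp file.

    The variable name is a hypothetical choice for this sketch.
    Returns the path of the newly created file.
    """
    raw = os.environ.get(var_name)
    if not raw:
        raise RuntimeError(f"environment variable {var_name} is not set")
    json.loads(raw)  # fail early if the secret is not valid JSON

    # mkstemp creates a file that only the creating user can read or write.
    fd, path = tempfile.mkstemp(prefix="cookies-", suffix=".json")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(raw)
    return path


# Demonstration with a dummy value; in CI the secret store sets the variable.
os.environ["NOTEBOOKLM_COOKIES_JSON"] = '[{"name": "SID", "value": "dummy"}]'
path = cookies_file_from_env()
try:
    with open(path, encoding="utf-8") as handle:
        print(json.load(handle)[0]["name"])  # SID
finally:
    os.remove(path)  # do not leave credentials on disk
```

The returned path could then be passed to `NotebookLM.from_cookies_file(path)` as in Method 1 above, with the file deleted once the run finishes.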
Listing notebooks
Once authenticated, retrieve all notebooks in the account.
from notebooklm import NotebookLM
client = NotebookLM.from_cookies_file("cookies.json")
notebooks = client.get_notebooks()
for nb in notebooks:
    print(nb.id, nb.title, nb.created_at)
Output:
nb_abc123 Research Notes 2026 2026-03-15T10:22:00Z
nb_def456 Product Launch Planning 2026-04-01T08:00:00Z
nb_ghi789 Customer Interview Themes 2026-04-10T14:33:00Z
Creating a notebook
from notebooklm import NotebookLM
client = NotebookLM.from_cookies_file("cookies.json")
notebook = client.create_notebook(title="Python AI Libraries Overview")
print(f"Created: {notebook.id} — {notebook.title}")
Output:
Created: nb_xyz999 — Python AI Libraries Overview
Adding sources
Sources are the grounding documents NotebookLM uses to answer questions and generate content. You can add URLs, uploaded PDFs, or plain text.
from notebooklm import NotebookLM
client = NotebookLM.from_cookies_file("cookies.json")
notebook = client.get_notebook("nb_xyz999")
# Add a URL source
source_url = notebook.add_source_url("https://example.com/blog/transformer-overview")
print(f"URL source added: {source_url.id}")
# Add a PDF file
with open("paper.pdf", "rb") as f:
    source_pdf = notebook.add_source_file(f, filename="paper.pdf")
print(f"PDF source added: {source_pdf.id}")
# Add plain text
source_text = notebook.add_source_text(
    "Attention is all you need. Transformers replaced RNNs for sequence modelling.",
    title="Transformer Note",
)
print(f"Text source added: {source_text.id}")
Output:
URL source added: src_aaa111
PDF source added: src_bbb222
Text source added: src_ccc333
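When a script receives a mixed list of inputs, a small dispatcher can pick the matching method for each one. The sketch below is an illustration only: `add_source_auto` is a hypothetical helper, and it relies solely on the three `add_source_*` methods as they appear in this sheet. A stand-in `FakeNotebook` records the calls so the example runs without network access.

```python
from pathlib import Path
from urllib.parse import urlparse


def add_source_auto(notebook, item: str, title: str = "Untitled note"):
    """Route one input to the add_source_* method that matches its kind.

    `notebook` is any object exposing the three methods used in this
    sheet; nothing here is verified against the real library.
    """
    parsed = urlparse(item)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return notebook.add_source_url(item)
    path = Path(item)
    if path.suffix.lower() == ".pdf" and path.is_file():
        with path.open("rb") as handle:
            return notebook.add_source_file(handle, filename=path.name)
    # Anything else is treated as plain text.
    return notebook.add_source_text(item, title=title)


class FakeNotebook:
    """Stand-in that records calls instead of contacting NotebookLM."""

    def __init__(self):
        self.calls = []

    def add_source_url(self, url):
        self.calls.append(("url", url))

    def add_source_file(self, handle, filename):
        self.calls.append(("file", filename))

    def add_source_text(self, text, title):
        self.calls.append(("text", title))


nb = FakeNotebook()
add_source_auto(nb, "https://example.com/blog/transformer-overview")
add_source_auto(nb, "Transformers replaced RNNs.", title="Transformer Note")
print(nb.calls)
```

Checking the URL scheme first keeps a string such as `https://example.com/paper.pdf` from being mistaken for a local file.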
Listing and deleting sources
from notebooklm import NotebookLM
client = NotebookLM.from_cookies_file("cookies.json")
notebook = client.get_notebook("nb_xyz999")
sources = notebook.get_sources()
for src in sources:
    print(src.id, src.title, src.source_type)
# Delete a source by ID
notebook.delete_source("src_aaa111")
print("Source deleted")
Output:
src_aaa111 Transformer Overview (URL) url
src_bbb222 paper.pdf pdf
src_ccc333 Transformer Note text
Source deleted
Generating a notebook guide (summary)
The notebook guide produces a structured summary with key topics, important quotes, and a study guide based on all sources in the notebook.
from notebooklm import NotebookLM
import time
client = NotebookLM.from_cookies_file("cookies.json")
notebook = client.get_notebook("nb_xyz999")
# Trigger guide generation (async — poll for completion)
job = notebook.generate_guide()
print(f"Guide job started: {job.id}")
# Poll until complete (may take 30–120 seconds)
while not job.is_complete():
    time.sleep(5)
    job.refresh()
    print(f" Status: {job.status}")
guide = job.result()
print("\n--- Summary ---")
print(guide.summary[:400])
print("\n--- Key Topics ---")
for topic in guide.topics[:3]:
    print(f"• {topic}")
Output:
Guide job started: job_guide_001
Status: processing
Status: processing
Status: complete
--- Summary ---
This notebook covers the evolution of neural network architectures for
natural language processing, focusing on the transformer model introduced
in "Attention is All You Need" (Vaswani et al., 2017). Sources discuss
self-attention mechanisms, positional encoding, and the shift away from
recurrent models...
--- Key Topics ---
• Self-attention and multi-head attention mechanisms
• Encoder-decoder architecture in sequence-to-sequence tasks
• Comparison of transformers vs LSTM for long-range dependencies
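The polling loop above runs until the job reports completion, so a job that never finishes would block the script indefinitely. A minimal sketch of a bounded wait follows. It assumes only the `is_complete()` and `refresh()` methods used in this sheet; `wait_for_job` and `FakeJob` are hypothetical names introduced for the example.

```python
import time


def wait_for_job(job, interval: float = 5.0, timeout: float = 300.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll `job` until is_complete() is true or `timeout` seconds pass.

    `sleep` and `clock` are injectable so the loop can be tested
    without real waiting.
    """
    deadline = clock() + timeout
    while not job.is_complete():
        if clock() >= deadline:
            raise TimeoutError(f"job not complete after {timeout:.0f}s")
        sleep(interval)
        job.refresh()
    return job


class FakeJob:
    """Stand-in job that completes after a fixed number of refreshes."""

    def __init__(self, refreshes_needed: int):
        self.remaining = refreshes_needed

    def is_complete(self) -> bool:
        return self.remaining <= 0

    def refresh(self) -> None:
        self.remaining -= 1


job = wait_for_job(FakeJob(refreshes_needed=3), interval=0.0, timeout=1.0)
print(job.is_complete())  # True
```

Using a monotonic clock for the deadline keeps the timeout unaffected by system clock adjustments during a long wait.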
Generating FAQ
The FAQ endpoint produces a list of question-answer pairs derived from the notebook's sources.
from notebooklm import NotebookLM
import time
client = NotebookLM.from_cookies_file("cookies.json")
notebook = client.get_notebook("nb_xyz999")
job = notebook.generate_faq()
while not job.is_complete():
    time.sleep(5)
    job.refresh()
faq = job.result()
for item in faq.items[:3]:
    print(f"Q: {item.question}")
    # Print the full answer; the sample output below is not truncated.
    print(f"A: {item.answer}")
    print()
Output:
Q: What problem do transformers solve that RNNs could not?
A: Transformers process all tokens in parallel using self-attention, eliminating the sequential
bottleneck that prevented RNNs from capturing long-range dependencies efficiently.
Q: What is positional encoding and why is it necessary?
A: Since transformers have no inherent sense of token order, positional encoding injects
position information into the embeddings using sinusoidal functions.
Q: How does multi-head attention differ from single-head attention?
A: Multi-head attention runs several attention operations in parallel, allowing the model to
jointly attend to information from different representation subspaces.
Generating audio podcast
NotebookLM's signature feature is a conversational two-host audio podcast based on the notebook's sources. The podcast endpoint generates an MP3 file.
from notebooklm import NotebookLM
import time
client = NotebookLM.from_cookies_file("cookies.json")
notebook = client.get_notebook("nb_xyz999")
# Trigger podcast generation (takes 2–10 minutes for a full episode)
job = notebook.generate_audio()
print(f"Audio job started: {job.id}")
while not job.is_complete():
    time.sleep(15)
    job.refresh()
    print(f" Status: {job.status}")
audio = job.result()
print(f"Duration: {audio.duration_seconds}s")
# Download to local file
audio.download("podcast_transformers.mp3")
print("Saved: podcast_transformers.mp3")
Output:
Audio job started: job_audio_002
Status: processing
Status: processing
Status: complete
Duration: 1842s
Saved: podcast_transformers.mp3
Audio generation is the most resource-intensive operation and frequently triggers rate limits. Wait at least 10 to 15 minutes between podcast generation requests, even when they target different notebooks.
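The spacing advice can be enforced in code rather than left to manual discipline. The sketch below shows one possible approach: a small `MinInterval` helper (a hypothetical name, not part of the library) that blocks until a minimum gap has passed since the previous request. The demonstration uses a fake clock so nothing actually sleeps.

```python
import time


class MinInterval:
    """Block until at least `seconds` have passed since the previous call."""

    def __init__(self, seconds: float, sleep=time.sleep, clock=time.monotonic):
        self.seconds = seconds
        self._sleep = sleep
        self._clock = clock
        self._last = None  # time of the previous wait(); None before first use

    def wait(self) -> float:
        """Sleep if needed and return the number of seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.seconds - (now - self._last))
        if delay:
            self._sleep(delay)
        self._last = self._clock()
        return delay


# Demonstration with a fake clock that advances when "sleeping".
fake_now = [0.0]
spacer = MinInterval(
    seconds=15 * 60,
    sleep=lambda s: fake_now.__setitem__(0, fake_now[0] + s),
    clock=lambda: fake_now[0],
)
print(spacer.wait())   # 0.0, the first call never waits
fake_now[0] += 60      # one minute of other work
print(spacer.wait())   # 840.0, the remaining 14 minutes
```

In a real script, `spacer.wait()` would be called immediately before each `notebook.generate_audio()` request, using the default `time.sleep` and `time.monotonic`.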
Asking questions (chat)
from notebooklm import NotebookLM
client = NotebookLM.from_cookies_file("cookies.json")
notebook = client.get_notebook("nb_xyz999")
answer = notebook.ask("What are the main limitations of the transformer architecture?")
print(answer.text)
print("\nCitations:")
for citation in answer.citations[:2]:
    print(f" [{citation.source_title}] — {citation.excerpt[:80]}...")
Output:
The primary limitations include quadratic memory complexity with sequence length,
requiring techniques like sparse attention or sliding windows for very long contexts,
and high compute costs during pre-training.
Citations:
[Transformer Overview] — "The self-attention mechanism scales as O(n²) in both time...
[paper.pdf] — "Long-sequence transformers require architectural modifications such as...
Error handling and rate limits
from notebooklm import NotebookLM
from notebooklm.exceptions import (
    AuthenticationError,
    RateLimitError,
    NotebookLMError,
)
import time
client = NotebookLM.from_cookies_file("cookies.json")
try:
    notebook = client.get_notebook("nb_xyz999")
    answer = notebook.ask("Summarise the main findings.")
    print(answer.text)
except AuthenticationError:
    print("Cookies expired — re-export from browser and update cookies.json")
except RateLimitError as e:
    print(f"Rate limited. Retry after {e.retry_after}s")
    time.sleep(e.retry_after)
except NotebookLMError as e:
    print(f"API error: {e}")
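The handler above sleeps after a rate limit but does not repeat the request. One way to add the retry is a small wrapper, sketched here under stated assumptions: `call_with_retry` is a hypothetical helper, and `RateLimited` is a local stand-in for `RateLimitError` so the example runs without the library installed. It honours a `retry_after` attribute when present and otherwise falls back to exponential backoff.

```python
import time


class RateLimited(Exception):
    """Local stand-in for the RateLimitError shown above."""

    def __init__(self, retry_after=None):
        super().__init__("rate limited")
        self.retry_after = retry_after


def call_with_retry(func, retry_on=RateLimited, max_attempts=4,
                    base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying on `retry_on` up to `max_attempts` times.

    Waits for the exception's retry_after when it is set, otherwise
    uses exponential backoff: base_delay, then 2x, 4x, and so on.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on as exc:
            if attempt == max_attempts:
                raise  # out of attempts, surface the error
            hinted = getattr(exc, "retry_after", None)
            delay = hinted if hinted is not None else base_delay * 2 ** (attempt - 1)
            sleep(delay)


# Demonstration: fail twice, then succeed; record delays instead of sleeping.
attempts = {"n": 0}
delays = []


def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RateLimited(retry_after=None)
    return "ok"


print(call_with_retry(flaky, sleep=delays.append))  # ok
print(delays)                                        # [1.0, 2.0]
```

With the real client, the call might look like `call_with_retry(lambda: notebook.ask("..."), retry_on=RateLimitError)`, assuming the exception behaves as shown in the handler above.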
Full pipeline example
from notebooklm import NotebookLM
import time
client = NotebookLM.from_cookies_file("cookies.json")
# 1. Create notebook
nb = client.create_notebook(title="AI Safety Research Digest")
# 2. Add sources
nb.add_source_url("https://example.com/ai-safety-overview")
nb.add_source_url("https://example.com/alignment-techniques")
with open("safety_paper.pdf", "rb") as f:
    nb.add_source_file(f, filename="safety_paper.pdf")
print(f"Sources added: {len(nb.get_sources())}")
# 3. Generate guide
guide_job = nb.generate_guide()
while not guide_job.is_complete():
    time.sleep(10)
    guide_job.refresh()
print("Guide complete")
# 4. Generate podcast
audio_job = nb.generate_audio()
while not audio_job.is_complete():
    time.sleep(20)
    audio_job.refresh()
audio_job.result().download("ai_safety_digest.mp3")
print("Podcast downloaded")
Alternatives and when to avoid notebooklm-py
| Scenario | Better option |
|---|---|
| Production automation | Google's official Vertex AI or Gemini API |
| Stable summarisation | google-generativeai with uploaded PDFs |
| Audio generation at scale | ElevenLabs, Azure TTS, or Google TTS |
| Long-context Q&A | Gemini 1.5 Pro via google-generativeai directly |
| The unofficial API is broken | Wait for or contribute a fix on the project's GitHub |
Quick reference
| Task | Code |
|---|---|
| Authenticate | NotebookLM.from_cookies_file("cookies.json") |
| List notebooks | client.get_notebooks() |
| Create notebook | client.create_notebook(title="Name") |
| Get notebook | client.get_notebook("nb_id") |
| Add URL source | notebook.add_source_url("https://...") |
| Add PDF source | notebook.add_source_file(file_obj, filename="f.pdf") |
| Add text source | notebook.add_source_text("...", title="Name") |
| List sources | notebook.get_sources() |
| Generate summary | job = notebook.generate_guide() then job.result() |
| Generate FAQ | job = notebook.generate_faq() then job.result() |
| Generate podcast | job = notebook.generate_audio() then job.result().download("f.mp3") |
| Ask question | notebook.ask("question") |
| Poll job | while not job.is_complete(): time.sleep(5); job.refresh() |