- Claim
- Knowledge work is bottlenecked by retrieval, not generation; the right tool extends memory without replacing thinking.
- Share
- Augmentation framing — tools extend, do not replace.
- Diff
- Bush imagined hardware; we instrument the loop between human and AI as the measurable unit.
What surrounds this work, and what is empty.
Curated map of academic research, industry voices, top products, and individual essays and podcasts that surround the NOUS OS research line. Each entry has the same four lines — what it is, the core claim, where we share, where we differ and what we add — plus a status flag for our depth of engagement.
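The entry shape described above can be sketched as a small record type. This is an illustrative sketch only: the names `AtlasEntry`, `Status`, and `render` are hypothetical, and the only status value that appears in this atlas is `note-written`, so the `listed` default is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    # Assumption: "listed" is a placeholder default; only "note-written"
    # is named in the atlas text.
    LISTED = "listed"
    NOTE_WRITTEN = "note-written"


@dataclass
class AtlasEntry:
    what: str    # what it is (person, paper, product, or show)
    claim: str   # the core claim
    share: str   # where we share
    diff: str    # where we differ and what we add
    status: Status = Status.LISTED
    note: Optional[str] = None  # slug of the inbound note, if one exists


def render(entry: AtlasEntry) -> str:
    """Render an entry in the Claim / Share / Diff layout used on this page."""
    lines = [
        "- Claim", f"- {entry.claim}",
        "- Share", f"- {entry.share}",
        "- Diff", f"- {entry.diff}",
    ]
    if entry.note is not None:
        lines += ["- Note", f"- {entry.note}"]
    return "\n".join(lines)


entry = AtlasEntry(
    what="NotebookLM",
    claim="Source-grounded AI is more trustworthy and useful; answers cite source passages.",
    share="Phase 4 (source check) directly aligned with NotebookLM's source-grounding ethos.",
    diff="NotebookLM optimizes one phase; we structure the full 6-phase loop including boundary + reflection.",
    status=Status.NOTE_WRITTEN,
    note="2026-05-17-notebooklm-product-walk",
)
print(render(entry))
```

Keeping the status flag and note slug on the record, rather than in free text, would make the atlas update step a field change instead of a manual edit.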
Six buckets.
Each bucket is an angle the research line connects to. Move a sub-line forward only when evidence accumulates, not when arguments accumulate.
The lineage we walk in.
Pre-LLM thinkers who already articulated the augmentation vs replacement question. Their ground is firm; our addition is the LLM-era boundary taxonomy and the capability-without-AI instrument.
- Claim
- Capability is a coupling of human, language, tool, training; the right design loop raises civilization's H-LAM/T capability.
- Share
- Explicit "human at the center" language; system-level thinking.
- Diff
- Did not live to confront LLM misalignment; our boundary taxonomy + capability-delta instrument are LLM-era additions.
- Claim
- Children learn most when they build; computers should enable construction, not deliver answers.
- Share
- Student-as-builder framing; Sandbox is constructionism re-instantiated with AI.
- Diff
- Papert used LOGO + minimal scaffolding; we add explicit boundary phases because LLM output is confident in ways LOGO never was.
- Claim
- Learning happens in the zone between what a learner can do alone and what they can do with help; scaffolding withdraws as competence grows.
- Share
- The Sandbox loop is dynamic AI-as-scaffold in the ZPD.
- Diff
- We make the withdrawal of scaffolding measurable via the capability-without-AI delta.
- Claim
- Most productive computing will be symbiosis between human and machine, not full automation.
- Share
- The literal word symbiosis — same lineage.
- Diff
- Licklider's symbiosis was speculative; ours is operational with measurable boundary integrity.
The reverse front.
Empirical and theoretical work showing offloading cognition onto tools changes what's encoded internally. NOUS OS sits in the literature gap that asks the inverse: under what conditions does offloading make people stronger?
- Claim
- Humans systematically offload memory, calculation, decision; this changes internal cognition, often costing native capability.
- Share
- The descriptive framing: offloading happens, and it does carry a cost.
- Diff
- The core NOUS OS question is the reverse: under what conditions does offloading make humans stronger?
- Claim
- People remember less when they know they can look the information up again; access to external information changes encoding.
- Share
- Validates that the offloading concern is empirically real.
- Diff
- They did not test interventions that flip the effect. We do.
- Claim
- Externalizing memory can free or impoverish the internal trace, depending on conditions.
- Share
- "Depending on conditions" — exactly the conditional we want to characterize.
- Diff
- We move from file-saving to AI-mediated learning; conditions become design parameters of a scaffold.
- Claim
- Effective learners cycle forethought → performance → self-reflection; SRL is teachable.
- Share
- The Sandbox 6-phase loop is an AI-native SRL instantiation; reflection card maps to SRL's self-reflection phase.
- Diff
- SRL is silent on what happens when a confident AI is in the middle. Our boundary phases are an AI-era addition.
- Claim
- Intrinsic / extraneous / germane load determines what is learned; design that minimizes extraneous load wins.
- Share
- Our 20-min / 6-phase / 3-4 min-per-phase structure is implicitly CLT-respectful.
- Diff
- We should explicitly cite CLT going forward; current docs do not.
The contemporary peers.
Same audience, same era, but mostly descriptive taxonomies and policy frameworks. We sit downstream as an instrumented loop with measured outcomes.
- Claim
- 16-competency framework for AI literacy across understand / use / evaluate / ethics dimensions.
- Share
- Same target audience; their critical-evaluation competency maps to Sandbox phase 4.
- Diff
- Descriptive taxonomy vs prescriptive 20-min protocol with measurable outcomes.
- Note
- 2026-05-17-long-magerko-2020
- Claim
- International policy framework — four dimensions × twelve competencies students should develop around AI.
- Share
- Student-facing focus, public-interest framing.
- Diff
- UNESCO writes standards; we run experiments. They consume evidence; we produce (small-N) evidence.
- Claim
- AI should be developed alongside humans, not at them; CRAFT classroom materials.
- Share
- "Human-centered" framing.
- Diff
- Publishing institution vs instrumented practice. Worth importing CRAFT lessons.
- Claim
- Learning compounds best when projects, peers, passion, play are present; AI partners can support these if authorship is preserved.
- Share
- The "preserve authorship" line is identical to ours.
- Diff
- They focus on creative-project authorship; we focus on research-loop authorship.
- Claim
- Specific prompt designs improve learning outcomes (e.g., "ask AI to ask you questions first").
- Share
- We are downstream practitioners of these patterns.
- Diff
- Most measure satisfaction or correctness; we measure capability without AI, which is rare.
Same problem from the lab side.
Labs publishing on alignment, scalable oversight, and model-side boundary design. Their evidence is on AI capability; ours is on human capability when AI is present.
- Claim
- Structured human feedback + explicit constitutional rules produce better-aligned model behavior.
- Share
- The explicit-boundary design philosophy.
- Diff
- Anthropic's boundaries are model-side; ours are interaction-side. Complementary halves of a symbiosis.
- Claim
- AI can augment scientific discovery; scalable oversight is solvable.
- Share
- The augmentation thesis.
- Diff
- DeepMind's evidence is on AI capability; ours is on human capability when AI is present.
- Claim
- Deployment studies of GPTs as tutors, study helpers, etc.
- Share
- Practical "AI in education" focus.
- Diff
- They measure usage + satisfaction; we measure human capability delta.
- Claim
- AI safety requires AI that explicitly defers to humans and remains uncertain about goals.
- Share
- Responsibility-stays-with-human stance.
- Diff
- Russell works at the alignment-policy layer; we work at the daily-interaction-design layer. His policy implies our interaction design.
- Claim
- If alignment is solved, the next decade could see profound gains in health, science, education, freedom.
- Share
- The optimistic-but-conditional vision.
- Diff
- Amodei describes the destination; we propose the per-session protocol that makes a human less likely to lose capability on the way there.
Same-shape attempts in production.
Commercial systems instantiating pieces of our protocol — sometimes one phase, sometimes a sibling philosophy. Useful benchmarks and design references; we are product-agnostic by design.
- Claim
- Socratic AI tutoring at scale with safety rails for under-18 users.
- Share
- Socratic / hints-not-answers stance.
- Diff
- Khanmigo is the AI; Sandbox is a protocol that wraps any AI. Closest commercial cousin to study.
- Claim
- Source-grounded AI is more trustworthy and useful; answers cite source passages.
- Share
- Phase 4 (source check) directly aligned with NotebookLM's source-grounding ethos.
- Diff
- NotebookLM optimizes one phase; we structure the full 6-phase loop including boundary + reflection.
- Note
- 2026-05-17-notebooklm-product-walk
- Claim
- Developer productivity rises when AI suggestions are explicitly accept-or-reject rather than silently applied.
- Share
- The explicit accept-or-reject boundary — directly analogous to our boundary phase.
- Diff
- Cursor proves the pattern in the code domain; we propose the same in research/learning.
- Claim
- Agents can do real engineering work safely if boundaries — permissions, plan mode, transparent diffs — are explicit.
- Share
- Every key design choice — explicit permissions, plan mode, transparent diffs.
- Diff
- Claude Code is the engineering instantiation of NOUS OS principles; Sandbox is the learning instantiation.
- Claim
- Agents can do longer-horizon tasks unattended.
- Share
- Shared agentic substrate.
- Diff
- Autonomy-maximizing vs symbiosis-maximizing. Useful contrast points.
- Claim
- Drafting is cheap; human curation is the value-add.
- Share
- Human-curates-before-commits pattern.
- Diff
- They optimize a workflow product; we extract the pattern and codify it as a loop principle.
- Claim
- AI search with cited sources beats non-cited AI chat.
- Share
- Source primacy.
- Diff
- They optimize answer-with-citations as the whole interaction; we make source-check one explicitly-bounded phase.
The living thinkers.
Essays, podcasts, and contemporary writing that move faster than papers. The L1 capture cron will pull selectively from these.
- Claim
- Most "tools for thought" are not transformative; transformative ones change the medium of thought itself.
- Share
- Medium-changes-thought framing.
- Diff
- Individual cognition vs explicit symbiosis with boundary taxonomy. His vocabulary will inform L3.
- Note
- 2026-05-17-matuschak-nielsen-2019
- Claim
- New mediums of representation enable new kinds of thought.
- Share
- Representation-as-cognitive-substrate stance.
- Diff
- His essays are inspirational and largely ungrounded in trials; we run trials.
- Claim
- Thought is constrained by representation; better representation enables better thought.
- Share
- Deep belief in representation-as-cognition.
- Diff
- Victor's work is largely demonstrative; we add empirical loops.
- Claim
- Treat AI as a co-worker; experiment widely; expect rapid capability shifts.
- Share
- Practical-empirical attitude.
- Diff
- Descriptive + adult-facing; ours is prescriptive + instrumented + student-facing.
- Claim
- Deep cognitive work requires uninterrupted attention; modern tools fight against this.
- Share
- The concern that AI use without design can shred attention.
- Diff
- His stance leans minimalist; we propose scaffolded AI use can produce deep work, not destroy it.
- Claim
- Long-form interview format with top AI researchers; claim N/A.
- Share
- Access to current AI-research thinking before it surfaces in papers.
- Diff
- Consumer relationship — pick 1-2 episodes per quarter for inbound notes.
- Claim
- Mainstream public-interest podcast with frequent AI episodes (Karen Hao, Demis Hassabis, Dario Amodei).
- Share
- Connection to broader societal AI discourse.
- Diff
- Mainstream-discourse-shaping; we feed our research-line writing back into that discourse over time.
- Claim
- An economist's takes on AI's distributional and cognitive impact, often contrarian and useful.
- Share
- Asking "what does this do to people" rather than "what can it do".
- Diff
- Macro-economic vs individual-cognitive lens.
- Claim
- AI is a platform shift; strategy logic from prior platform shifts applies.
- Share
- Product-strategy literacy informs our "what is happening in product land" tracking.
- Diff
- Product-strategy-focused vs research-method-focused.
- Claim
- Practitioner format — N/A.
- Share
- Stays current on what's deployable.
- Diff
- Principles-and-evidence layer vs deployable-engineering layer.
- Claim
- Interview format — N/A.
- Share
- Broad-public AI discourse.
- Diff
- Signal-to-noise variable; pick selectively.
- Claim
- Interview / journalism format.
- Share
- Access to Chinese AI ecosystem thinking, often missing from Western feeds.
- Diff
- L1 cron cannot reach (no clean RSS); manual L2 promotion for Chinese sources.
Where we actually stand.
Empirical, instrumented, public protocols that measure whether a human becomes more capable in AI's absence after AI-assisted work.
Most cognitive-offloading work measures the negative case. Most AI-literacy work writes taxonomies. Most product studies measure usage and satisfaction. Most industry research measures the AI. Most individual essays speculate. The protocol-with-measured-capability-delta position is largely empty. That is where this research line lives.
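One plausible reading of the capability delta can be sketched in a few lines. This sketch assumes, since the atlas does not define the formula, that the delta is the unaided score after an AI-assisted session minus the unaided score before it. The function names and the numbers below are hypothetical placeholders, not results.

```python
from statistics import mean


def capability_delta(pre_unaided: float, post_unaided: float) -> float:
    """Assumed definition: unaided score after the session minus before.

    Both scores are taken with no AI available. A positive value means the
    learner can do more alone than before the AI-assisted work.
    """
    return post_unaided - pre_unaided


def mean_delta(sessions: list[tuple[float, float]]) -> float:
    """Average delta over (pre_unaided, post_unaided) pairs."""
    return mean(capability_delta(pre, post) for pre, post in sessions)


# Placeholder scores on a 0-1 scale, for illustration only.
sessions = [(0.5, 0.75), (0.5, 0.5)]
print(mean_delta(sessions))
```

The point of the sketch is the measurement target: both scores are collected without AI, which is what separates this position from studies of usage, satisfaction, or aided correctness.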
How this atlas stays alive.
An atlas is not a reading list. It is the cumulative record of what we have engaged with, how engaged we are, and where we differ.
- L1 capture (daily). A scheduled remote agent pulls 5–8 narrow, high-signal sources, filters by anchor keywords, and writes a raw daily inbox.
- L2 triage (weekly). Scheduled agent + human approval promotes 1–3 candidates per week to full 1-page inbound notes. Each note has an HTML mirror and joins the public corpus.
- L3 synthesis (quarterly). Human + Claude write a synthesis: "what we read, what we changed because of it."
- Atlas update. New inbound notes are appended to the appropriate bucket above with status flipped to note-written and a link to the inbound markdown.
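The L1 and L2 steps above can be sketched as two small functions. This is a minimal sketch under stated assumptions: the item fields (`title`, `summary`), the anchor keywords, and the function names are all hypothetical, matching is a plain case-insensitive substring check, and human approval is modeled as a callback. The real cron may work differently.

```python
from typing import Callable

# Assumption: example anchor keywords drawn from terms used in this atlas.
ANCHORS = ["cognitive offloading", "scaffolding", "AI literacy"]


def matches_anchor(text: str, anchors: list[str]) -> bool:
    """True if any anchor keyword occurs in the text, ignoring case."""
    lowered = text.lower()
    return any(anchor.lower() in lowered for anchor in anchors)


def l1_capture(items: list[dict], anchors: list[str]) -> list[dict]:
    """L1: keep only pulled items whose title or summary hits an anchor."""
    return [
        item for item in items
        if matches_anchor(item["title"] + " " + item.get("summary", ""), anchors)
    ]


def l2_triage(inbox: list[dict], approve: Callable[[dict], bool],
              max_per_week: int = 3) -> list[dict]:
    """L2: promote human-approved candidates, capped at max_per_week."""
    promoted: list[dict] = []
    for item in inbox:
        if len(promoted) == max_per_week:
            break
        if approve(item):
            promoted.append(item)
    return promoted


# Made-up feed items, for illustration only.
pulled = [
    {"title": "Scaffolding in AI tutoring", "summary": "A classroom study."},
    {"title": "New GPU benchmark", "summary": "Throughput numbers."},
    {"title": "Survey", "summary": "Measuring AI literacy in schools."},
]
inbox = l1_capture(pulled, ANCHORS)
notes = l2_triage(inbox, approve=lambda item: True)
print([item["title"] for item in notes])
```

Modeling approval as a callback keeps the human decision outside the scheduled agent, in line with the "scheduled agent + human approval" wording of the L2 step.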