Onira

AI Script Engine

Onira's Script Engine writes structured documentary scripts with 60-80 scenes - each with camera angles, mood, pacing, and visual descriptions. Not a template fill. A director.

The Differentiation

Not a GPT template

Most AI video tools give GPT a prompt and ask it to “write a script.” What comes back is a wall of narration text - no scenes, no structure, no production direction. You then have to manually break it into clips, choose shots, and figure out pacing yourself.

Onira's Script Engine is an orchestration layer that runs above the language model. Before a single word of narration is written, the engine determines the four-act narrative structure for your specific topic, decides how many scenes each act needs, what type of visual each scene requires, and what the emotional trajectory looks like across the full runtime.

The language model is one tool in the pipeline. The engine is the director.

Generic AI video tools

  • Paste prompt → get narration text
  • No scene boundaries or shot directions
  • No narrative arc or structure awareness
  • Same template regardless of topic
  • Manual effort required to produce anything

Onira Script Engine

  • Analyse topic → build narrative architecture
  • 60-80 scenes, each a complete production brief
  • Hook → Rising Action → Revelation → Denouement
  • Adapted to your topic, length, and target audience
  • Ready for direct handoff to visual generation

Narrative Architecture

Documentary narrative structure

The engine understands story. It doesn't just write scenes - it sequences them in a four-act structure borrowed from broadcast documentary filmmaking. Hook, rising action, revelation, denouement. Every script has a spine.

Hook

0:00 – 1:30 · 8-12 scenes

Opens with the most visually striking or emotionally charged moment. The engine identifies the single most arresting fact, image, or question within your topic and leads with it. Designed to prevent early drop-off by making the viewer feel they cannot leave yet.

  • Dramatic opening visual - maximum visual impact
  • Provocative question or unresolved statement
  • Quick montage of what's to come
  • Narrator establishes the premise and stakes

Rising Action

1:30 – 5:00 · 20-30 scenes

Builds the knowledge foundation the audience needs to appreciate the revelation. The engine introduces key concepts, characters, historical context, or scientific principles in ascending order of complexity, ensuring each scene earns the next.

  • Context and background established
  • Key subjects, characters, or concepts introduced
  • Central conflict or challenge made concrete
  • Evidence, data, and supporting visuals

Revelation

5:00 – 8:00 · 15-20 scenes

The climax where the central thesis is revealed, the mystery resolved, or the discovery made. The engine recognises which element of the topic carries the most surprise or significance and structures the preceding acts to maximise its impact when it lands.

  • Central revelation or discovery sequence
  • Expert insight or decisive evidence
  • Emotional peak - music and pacing aligned
  • Visual crescendo; highest production value scenes

Denouement

8:00 – 10:00 · 10-15 scenes

Resolution, reflection, and forward momentum. The engine wraps narrative threads, articulates the broader implications, and closes on a final image or statement designed to linger. Endings that are earned, not merely concluded.

  • Resolution of the central conflict or question
  • Broader implications and context
  • Call to reflection or action
  • Memorable closing visual - a director's final frame

Narrative Tension Arc - 10-minute documentary

Hook → Rising Action → Revelation → Denouement
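
The act timings and scene counts listed for the 10-minute documentary can be written down as a small lookup table. This is a minimal sketch, not Onira's implementation: the names `ACTS`, `act_at`, and `scene_length_range` are hypothetical, and the only data used is the timings and scene ranges quoted on this page.

```python
# Four-act layout for a 10-minute documentary, as quoted on this page.
# Each entry: (act name, start second, end second, min scenes, max scenes).
ACTS = [
    ("Hook", 0, 90, 8, 12),
    ("Rising Action", 90, 300, 20, 30),
    ("Revelation", 300, 480, 15, 20),
    ("Denouement", 480, 600, 10, 15),
]


def act_at(seconds: float) -> str:
    """Return the act that a timestamp (in seconds) falls into."""
    for name, start, end, _, _ in ACTS:
        if start <= seconds < end:
            return name
    raise ValueError(f"{seconds}s is outside the 0-600s runtime")


def scene_length_range(act: str) -> tuple[float, float]:
    """Return the (shortest, longest) average scene length in seconds for an act."""
    for name, start, end, min_scenes, max_scenes in ACTS:
        if name == act:
            span = end - start
            # More scenes in the same span means shorter scenes on average.
            return span / max_scenes, span / min_scenes
    raise KeyError(act)


print(act_at(95))                  # a timestamp just after the Hook ends
print(scene_length_range("Hook"))  # average Hook scene length bounds
```

Under these figures a Hook scene averages between 7.5 and 11.25 seconds, which is in line with the scene durations shown in the examples on this page.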

Scene-Level Intelligence

Every scene is a production brief

Each of the 60-80 scenes the engine produces is a complete, self-contained production directive. Nothing is left for the user to invent. The visual generation models receive exactly what they need to produce the intended shot.

Setting

Physical location and environmental context for the visual generator

Camera Angle

Shot type, angle, and movement - dolly, tracking, static, aerial, macro

Mood / Tone

Emotional register that guides lighting, music, and colour grading

Pacing

Cut rhythm, hold duration, and transition energy for the scene

Duration

Precise scene length in seconds, synced to the narration audio

Transition

Cut type into the next scene - hard cut, dissolve, J-cut, L-cut

Visual Prompt

Complete prompt sent directly to the image or video generation model

Narration

Voice-over text read by the AI narrator, timed to within 0.1 seconds
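
The eight fields above map naturally onto a typed record. The following is an illustrative sketch only: `SceneBrief` and its field names are assumptions made for this example, not Onira's published schema. The sample values are taken from the Scene 31 example on this page.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SceneBrief:
    """One scene's production brief. Field names are hypothetical."""

    setting: str
    camera_angle: str
    mood: str
    pacing: str
    duration_s: float
    transition: str
    visual_prompt: str
    narration: str

    def __post_init__(self) -> None:
        # A brief is only useful downstream if every field is filled in.
        if self.duration_s <= 0:
            raise ValueError("duration_s must be positive")
        for field in fields(self):
            value = getattr(self, field.name)
            if isinstance(value, str) and not value.strip():
                raise ValueError(f"{field.name} must not be empty")


scene_31 = SceneBrief(
    setting="Deep-sea submersible interior, instrument panels illuminated",
    camera_angle="Medium close-up, slight handheld shake",
    mood="Tense, claustrophobic",
    pacing="Medium - cross-cut ready",
    duration_s=7.1,
    transition="Hard cut to exterior",
    visual_prompt=(
        "Interior of deep-sea research submersible, instrument panels glowing "
        "amber, cramped space, two silhouetted crew members, slight camera "
        "shake, cinematic documentary style, tense atmosphere"
    ),
    narration=(
        "At 4,000 meters, the pressure outside exceeds 400 atmospheres. The "
        "walls of the submersible flex slightly - a fact the engineers "
        "designed for, and the crew tries not to think about."
    ),
)
```

Making the record immutable and validating it on construction means an incomplete brief fails early, before it reaches a visual generation step.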

Why this matters

Image and video generation models are highly sensitive to prompt quality. A vague prompt produces a generic frame. A precise, production-aware prompt - with mood, camera angle, setting, and context - produces something that fits exactly within a coherent visual story. The Script Engine exists to write those precise prompts automatically, for every single scene, in the right order, with the right emotional progression.

From Weak Prompt to Strong Script

“a documentary about the deep ocean”

That is the complete user input. Here is what the Script Engine produces - three scenes from a full 72-scene screenplay, each ready for immediate handoff to video generation.

User input

> a documentary about the deep ocean
Script Engine processing - building narrative structure for 72 scenes...
Scene 03 · 9.4s · Dissolve to black

Setting

Open ocean surface at night, bioluminescent plankton

Camera

Low aerial, wide shot, slow drift eastward

Mood

Wonder, otherworldly

Pacing

Slow - 4-second hold before cut

Narration

In the dark, the ocean glows. Billions of microscopic organisms light the water from below - a phenomenon visible from space, yet invisible to most of humanity.

Visual Prompt (sent to model)

Night ocean surface covered in blue-green bioluminescent plankton, aerial view, low altitude, slow drift, cinematic wide shot, ultra-dark sky, stars reflected in water

Scene 31 · 7.1s · Hard cut to exterior

Setting

Deep-sea submersible interior, instrument panels illuminated

Camera

Medium close-up, slight handheld shake

Mood

Tense, claustrophobic

Pacing

Medium - cross-cut ready

Narration

At 4,000 meters, the pressure outside exceeds 400 atmospheres. The walls of the submersible flex slightly - a fact the engineers designed for, and the crew tries not to think about.

Visual Prompt (sent to model)

Interior of deep-sea research submersible, instrument panels glowing amber, cramped space, two silhouetted crew members, slight camera shake, cinematic documentary style, tense atmosphere

Scene 67 · 11.8s · L-cut into closing narration

Setting

Surface ship deck at dawn, crew watching footage playback

Camera

Over-shoulder tracking, pull back to wide

Mood

Triumphant, reflective

Pacing

Slow - denouement tempo

Narration

They had gone deeper than most humans will ever go, and found something that rewrites what we know about where life can exist - and by extension, where we might one day look for it beyond this planet.

Visual Prompt (sent to model)

Ship deck at dawn, golden hour light, research crew gathered around a monitor showing deep-sea footage, over-shoulder composition pulls back to reveal open ocean, hopeful mood, cinematic

3 of 72 scenes shown. The full script includes all 72, ordered by narrative arc, with complete production metadata for each.

Formatted Output

What the engine actually produces

Each scene is output as structured data - not prose. This allows the downstream pipeline to parse and act on every field without ambiguity.

Generated screenplay excerpt · Scene 31 of 72

INT. DEEP-SEA SUBMERSIBLE - INSTRUMENT PANEL - CONTINUOUS

SETTING

Interior of deep-sea research submersible at 4,000m depth. Instrument panels illuminate two silhouetted crew members. Claustrophobic, tense.

NARRATION

“At 4,000 meters, the pressure outside exceeds 400 atmospheres. The walls of the submersible flex slightly - a fact the engineers designed for, and the crew tries not to think about.”

Duration: 7.1s · Camera: MCU, slight handheld · Mood: tense, claustrophobic · Transition: hard cut · Model: Kling 3.0

72 scenes generated · Arc: hook → rising action → revelation → denouement
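
To make "structured data" concrete, here is one way a downstream step might parse and check a scene record like the excerpt above. The JSON layout, the key names, and the `parse_scene` helper are all assumptions for illustration. Onira's actual output format is not documented on this page.

```python
import json

# Keys a downstream step would need. The names are hypothetical.
REQUIRED_FIELDS = {
    "scene", "setting", "narration", "duration_s",
    "camera", "mood", "transition", "model",
}

# The Scene 31 excerpt, encoded as a JSON record.
record = json.dumps({
    "scene": 31,
    "setting": (
        "Interior of deep-sea research submersible at 4,000m depth. "
        "Instrument panels illuminate two silhouetted crew members."
    ),
    "narration": (
        "At 4,000 meters, the pressure outside exceeds 400 atmospheres. "
        "The walls of the submersible flex slightly - a fact the engineers "
        "designed for, and the crew tries not to think about."
    ),
    "duration_s": 7.1,
    "camera": "MCU, slight handheld",
    "mood": "tense, claustrophobic",
    "transition": "hard cut",
    "model": "Kling 3.0",
})


def parse_scene(payload: str) -> dict:
    """Decode one scene record and reject it if any required key is missing."""
    scene = json.loads(payload)
    missing = REQUIRED_FIELDS - set(scene)
    if missing:
        raise ValueError(f"scene record is missing fields: {sorted(missing)}")
    return scene


scene = parse_scene(record)
print(scene["scene"], scene["duration_s"], scene["model"])
```

Because every field has a fixed key, a consumer can read the duration or the target model directly instead of extracting them from prose.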

FAQ

Common questions

How good are the scripts Onira's Script Engine produces?

Scripts are structured at a professional documentary level - 60 to 80 scenes with narrative arc, camera angles, mood cues, and narration. They are not polished final drafts in every case, but they provide a complete, coherent production brief that would take a human screenwriter hours to produce. You can edit any scene before generating video.

Can I customise the script after it's generated?

Yes. Every scene, narration line, camera angle, mood note, and duration is editable before you proceed to video generation. You can also regenerate individual scenes without restarting from scratch.

What topics can the Script Engine handle?

Any topic that lends itself to documentary storytelling: history, science, nature, technology, biography, culture, finance, geopolitics, true crime, and more. The engine adapts the narrative approach to the subject matter.

How long does script generation take?

A full 60-80 scene script typically generates in 60-90 seconds using Gemini 2.5 Pro as the underlying model. The engine researches the topic, builds the narrative arc, and writes every scene in a single pass.

Does the Script Engine work for shorter videos too?

Yes. You specify the target duration and the engine scales scene density accordingly - from 15-20 scenes for a 3-minute explainer to 80+ scenes for a 15-minute deep-dive. The four-act narrative structure adapts to any length.
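
As a rough way to reason about scene density at other lengths, the sketch below interpolates linearly between the lower-bound scene counts quoted on this page: 15 scenes at 3 minutes, 60 at 10 minutes, and 80 at 15 minutes. Linear interpolation is an assumption made for this example. The engine's actual scaling rule is not published, and `ANCHORS` and `min_scenes` are hypothetical names.

```python
# Lower-bound scene counts quoted on this page: (minutes, scenes).
ANCHORS = [(3.0, 15), (10.0, 60), (15.0, 80)]


def min_scenes(minutes: float) -> int:
    """Estimate a lower-bound scene count by interpolating between anchors."""
    if not ANCHORS[0][0] <= minutes <= ANCHORS[-1][0]:
        raise ValueError("duration is outside the 3-15 minute range quoted here")
    # Walk consecutive anchor pairs and interpolate inside the matching one.
    for (m0, s0), (m1, s1) in zip(ANCHORS, ANCHORS[1:]):
        if m0 <= minutes <= m1:
            fraction = (minutes - m0) / (m1 - m0)
            return round(s0 + fraction * (s1 - s0))
    raise AssertionError("unreachable: anchors cover the validated range")


for length in (3, 10, 12.5, 15):
    print(length, min_scenes(length))
```

The estimate is only meaningful inside the 3 to 15 minute range, so the function raises an error rather than extrapolating beyond the quoted figures.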

Let the engine write your script

Type your topic. Get a 60-80 scene structured documentary screenplay in under 90 seconds - with camera angles, mood notes, and visual prompts for every scene.

From $79/mo · Cancel anytime