Lecture Video Maker

GUIDES AND TUTORIALS

The tutorial videos show the main features and how to use them. The guides explain the advanced features, including the duration control of slides. Other detailed topics are covered in the Q&A section.

TUTORIALS

The following YouTube videos explain how to use the main features of Lecture Video Maker.

GUIDES

Open the Sidebar

Click on the "Start" command to open the sidebar.

Sidebar Components

The sidebar comprises three major components.

  • The command buttons provide access to the main features of the Add-On, such as voice recording and video generation.

  • The estimated duration shows the estimated length of the video to be generated. Note that it is only a rough estimate.

  • The slide list shows the slide titles, indications of recorded voices and embedded resources, and the estimated video duration of each slide.

Speech Synthesis

Click on the ear-mark icon on the sidebar to synthesize the speech from speaker notes.

  • The speech synthesis uses Google Cloud Text-to-Speech in the background. You need to supply an API key for that access. (See Tutorial #3.)

  • After the synthesis completes, the slide list updates, and the ear-mark icon appears.

  • The synthesized voice overwrites the recorded voice. If you want to keep the recorded voice while testing the speech synthesis, select the text in the speaker notes and then synthesize. The synthesis for the selected text is treated as a tentative synthesis only.

  • You can configure the synthesis parameters, such as the voice, pitch, and speed, on the "Speech Synthesis" tab of the "Settings" form.

  • Google Cloud Text-to-Speech rejects texts that are too long. Its limit is 5,000 characters, but that limit does not apply directly to the text you typed in the speaker notes, because the Add-On adds many special sequences (such as silence) to the text before sending it to Google Cloud Text-to-Speech. A rule of thumb is to keep your typed text under 2,500 characters.

  • Avoid using special characters such as < or > within the text. The text is treated as a fragment of SSML, so you can embed SSML elements such as <break time="200ms" />, but malformed markup such as <broken</>tag> is not allowed.

  • During the synthesis, the Add-On accepts a special notation for text replacements. For example, if you want the phrase "US" to be pronounced as "United States", write it as "[US|United States]". The Add-On will replace this bracketed phrase with "United States" before the synthesis.
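
The checks described in the list above can be tried out before synthesizing. The following Python sketch is only an illustration, not the Add-On's actual code: the function names, the regular expression for the bracket notation, and the use of 2,500 as a hard threshold are all assumptions. The markup check tests XML well-formedness only, which is weaker than full SSML validation.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical pattern for the "[written|spoken]" notation described above.
REPLACEMENT = re.compile(r"\[([^\[\]|]+)\|([^\[\]|]+)\]")

# Rule-of-thumb limit for the typed text, taken from the guide above.
TYPED_TEXT_LIMIT = 2500


def expand_replacements(notes: str) -> str:
    """Replace each "[written|spoken]" phrase with its spoken form."""
    return REPLACEMENT.sub(lambda match: match.group(2), notes)


def is_within_rule_of_thumb(notes: str) -> bool:
    """Return True if the typed text is shorter than the rule-of-thumb limit."""
    return len(notes) < TYPED_TEXT_LIMIT


def is_well_formed_fragment(notes: str) -> bool:
    """Return True if the text parses as XML when wrapped in a <speak> element."""
    try:
        ET.fromstring(f"<speak>{notes}</speak>")
        return True
    except ET.ParseError:
        return False


notes = 'Welcome to the [US|United States] office. <break time="200ms" /> Let us begin.'
print(expand_replacements(notes))
print(is_within_rule_of_thumb(notes))
print(is_well_formed_fragment(notes))
print(is_well_formed_fragment("This is a <broken</>tag> example."))
```

Because the real Add-On inserts extra sequences before sending the text, passing these checks does not guarantee that the synthesis succeeds.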

Voice Recording

Click on the microphone icon on the sidebar to record your voice.

  • Before starting the recording, the Add-On counts down from three. During the countdown, you can cancel the recording by clicking on the count. (See Tutorial #2.)

  • After the recording completes, the slide list updates, and the microphone icon appears.

  • You can record up to five minutes per slide.

Voice Removal

Click on the cross icon on the sidebar to clear the recorded or synthesized voice on the current slide.

  • Before clearing the voice, the icon changes to the encircled cross and waits for three seconds. During that wait, you can abort the removal by clicking on the button. (See Tutorial #2.)

Voice Reloading

Click on the synchronize icon on the sidebar to load the recorded or synthesized voice on the current slide.

  • You can play the loaded voice by clicking on the play button on the media player on the sidebar.

  • Due to technical limitations, when you change the current slide to another, the loaded voice on the media player won't change automatically. That is why you need this reloading functionality.

Video Generation

Click on the download icon on the sidebar to generate the video.

  • The video generation will take several minutes. The actual time needed depends on various factors. (See Tutorial #1.)

  • After the video generation completes, the video is downloaded to your local computer automatically.

  • You can configure the quality of the generated video on the "Video" tab of the "Settings" form. (See Tutorial #1.)

Sound Embedding

Click on the musical note icon on the sidebar to embed sounds into the presentation.

  • The "Add a sound" form appears. You can choose the sound (.mp3, .ogg, .oga, .wav) to be embedded from the list. Those sound files are located in the "resources" folder, which resides in the same folder as the presentation. (See Tutorial #4.)

  • When you choose the sound and click on the "Add the chosen sound" button, a special command like "!sound filename.mp3" will be added to the speaker notes. (See Tutorial #4.)

  • Also, the slide list updates, and the musical-note icon appears.
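
The selection and the resulting command described above can be pictured with a short Python sketch. It is an assumption-laden illustration rather than the Add-On's implementation: the function names are hypothetical, matching extensions case-insensitively is a guess, and appending the command as a new last line of the speaker notes is only one possible placement.

```python
# Sound file extensions listed in the guide above.
SOUND_EXTENSIONS = (".mp3", ".ogg", ".oga", ".wav")


def list_sound_files(file_names):
    """Return the names in the "resources" folder that look like sound files."""
    return sorted(
        name for name in file_names if name.lower().endswith(SOUND_EXTENSIONS)
    )


def add_sound_command(notes, file_name):
    """Append a "!sound" command for the chosen file to the speaker notes."""
    command = f"!sound {file_name}"
    return f"{notes}\n{command}" if notes else command


resources = ["intro.mp3", "clip.mp4", "bell.WAV", "notes.txt"]
choices = list_sound_files(resources)
print(choices)
print(add_sound_command("Welcome to the lecture.", choices[0]))
```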

Movie Embedding

Click on the boxed triangle icon on the sidebar to embed movies into the presentation.

  • This functionality works almost the same as the sound embedding.

  • The "Add a movie" form appears. You can choose the movie (.mp4, .mp2, .webm, .avi) to be embedded from the list. Those movie files are located in the "resources" folder, which resides in the same folder as the presentation.

  • When you choose the movie and click on the "Add the chosen movie" button, a special command like "!movie filename.mp4" will be added to the speaker notes.

  • Also, the slide list updates, and the boxed triangle icon appears.

  • The duration of a slide with embedded movies is set to be the total duration of those movies.

Crossfade

Click on the mixed-pictures icon on the sidebar to choose the crossfade effect applied to the slide transition.

  • The "Choose xfade" form appears. You can choose the crossfade effect applied to the transition from the current slide to the next slide. (See Tutorial #4.)

  • When you choose the effect and click on the "Add the chosen xfade" button, a special command like "!xfade fade" will be added to the speaker notes.

  • Also, the slide list updates, and the mixed-pictures icon appears.

  • Note that you can use only one crossfade effect per page. If you put multiple xfade commands on a page, only the last one is processed, and the others are ignored.
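
The "last one wins" rule above can be sketched in Python as follows. This is a hypothetical illustration, not the Add-On's parser: it assumes each command sits on its own line of the speaker notes, and the effect name "wipeleft" is only a placeholder that may not exist in the "Choose xfade" form.

```python
def find_xfade(notes):
    """Return the effect of the last "!xfade" command in the notes, or None."""
    effect = None
    for line in notes.splitlines():
        parts = line.strip().split(maxsplit=1)
        if len(parts) == 2 and parts[0] == "!xfade":
            # A later command replaces any earlier one on the same page.
            effect = parts[1].strip()
    return effect


notes = "Introduction.\n!xfade fade\nMore narration.\n!xfade wipeleft"
print(find_xfade(notes))
```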

Text-to-Speech Voice Change

Click on the face icon on the sidebar to choose the speech synthesis voice specific to the current slide.

  • The "Change text-to-speech voice" form appears. You can choose the synthesis voice applied only to the current slide.

  • When you choose the voice and click on the "Use the chosen voice" button, a special command like "!config {...}" will be added to the speaker notes.

  • Note that you can change the voice only once per page. If you put multiple config commands on a page, only the last one is processed, and the others are ignored.

Settings Configuration

Click on the gear icon on the sidebar to show the "Settings" form.

On the "Settings" form, you can configure the following parameters.


SPEECH SYNTHESIS

  • pitch: The pitch of the synthesized voice. A larger value means a higher tone.

  • speed: The speed of the synthesized speech. A larger value means faster speech.

  • voice: The voice used for the synthesis. You can override this setting with the "!config" command on each page.

SILENCE

  • For page without audio: The Add-On automatically adds some silence during the video generation. This parameter determines the duration of the silence added to a page with blank speaker notes. The following parameters work similarly.

  • At page start: The duration of the silence at the start of each page. This silence will be removed when the page has an embedded sound.

  • At blank line: The duration of the silence at each blank line in the speaker notes. This silence does not apply to leading and trailing blank lines.

  • At line end: The duration of the silence added to the end of each line in speaker notes.
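
To see how these four parameters might combine, here is a rough Python sketch. It is not the Add-On's actual calculation: the class and function names, the example values, and the assumption that the silences simply add up (with no silence counted for command lines such as "!sound") are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SilenceSettings:
    page_without_audio: float  # seconds, for a page with blank speaker notes
    page_start: float          # seconds, at the start of each page
    blank_line: float          # seconds, per blank line inside the notes
    line_end: float            # seconds, per non-blank line


def total_silence(notes, settings, has_embedded_sound=False):
    """Estimate the silence added to one page, under the stated assumptions."""
    # strip() drops leading and trailing blank lines, which get no silence.
    lines = notes.strip().splitlines()
    if not lines:
        return settings.page_without_audio
    blank = sum(1 for line in lines if not line.strip())
    spoken = len(lines) - blank
    # The page-start silence is removed when the page has an embedded sound.
    start = 0.0 if has_embedded_sound else settings.page_start
    return start + blank * settings.blank_line + spoken * settings.line_end


settings = SilenceSettings(
    page_without_audio=2.0, page_start=0.5, blank_line=1.0, line_end=0.25
)
notes = "\nFirst line.\nSecond line.\n\nThird line.\n"
print(total_silence(notes, settings))
print(total_silence(notes, settings, has_embedded_sound=True))
print(total_silence("", settings))
```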

SOUND

  • sample rate: The sample rate of the audio in the generated video.

  • bit rate: The bit rate of the audio in the generated video.

  • channels: The number of audio channels in the generated video.

VIDEO

  • quality: The quality of the generated video. A smaller value means better quality. (See Tutorial #1.)

  • frame rate: The frame rate of the generated video. This frame rate also applies to the embedded videos.

  • height: The height of the generated video.

  • xfade duration: The duration of the crossfade effect applied to transitions.

BGM

  • repetition: The number of times the BGM is played during the video generation.

  • loudness weight: The weight of the BGM loudness. A larger value means a louder BGM.

  • fading out from: The time at which the BGM starts fading out. You can choose "No fade-out" by moving the slider to the leftmost position.

  • fading out duration: The duration of the fading out of the BGM.


You can save the configured parameters in the following two ways.


  • Save to this presentation: Save the parameters to the current presentation only. Other presentations won't be affected.

  • Save as default: Save the parameters as the default settings. This affects presentations you create in the future and existing presentations that use the default settings. Note that this option clears the settings saved on the current presentation.
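
The difference between the two options can be modeled with a small Python sketch. It is a simplified assumption about the behavior described above, not the Add-On's storage code: the class name, the method names, and the presentation identifiers "A" and "B" are hypothetical, and a presentation is assumed to use either its own saved settings or the defaults as a whole.

```python
class SettingsStore:
    """Toy model of per-presentation settings that fall back to defaults."""

    def __init__(self, defaults):
        self.defaults = dict(defaults)
        self.per_presentation = {}  # presentation id -> saved settings

    def save_to_presentation(self, presentation_id, values):
        # Only the given presentation is affected.
        self.per_presentation[presentation_id] = dict(values)

    def save_as_default(self, presentation_id, values):
        # The defaults change, and the current presentation's own
        # settings are cleared so that it uses the new defaults.
        self.defaults = dict(values)
        self.per_presentation.pop(presentation_id, None)

    def effective(self, presentation_id):
        return self.per_presentation.get(presentation_id, self.defaults)


store = SettingsStore({"pitch": 0})
store.save_to_presentation("A", {"pitch": 2})
print(store.effective("A"), store.effective("B"))
store.save_as_default("A", {"pitch": 5})
print(store.effective("A"), store.effective("B"))
```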


If you want to restore the settings to their defaults at installation time, click on the "Reset" button.

Resource Files Export

Click on the "Export resource files to Drive" command to export resource files, including the snapshot pictures of slides, the texts in the speaker notes, and the recorded or synthesized voices. The exported files will be saved in a specially created export folder on your Google Drive. The export could take several minutes to complete.

Secrets Settings

Click on the "Set app secrets" command to open the "Secrets" form. In the form, you can confirm and update the API key for Google Cloud Text-to-Speech. (See Tutorial #3.)

Subscription Settings

Click on the "Check my subscription" command to open the "My Subscription" form. In the form, you can confirm and update your subscription. (See Tutorial #5.)

Q&A

I cannot install the Add-On. OR, I cannot show the sidebar.

One possible reason is the simultaneous use of multiple Google accounts. If you have multiple Google accounts and log in to them simultaneously, the installation or the command executions could fail. Please log in to only one account when you use the Add-On.

It stops with a "saveAs is not defined" error.

That error could occur when your internet connection is unstable and the Add-On fails to load a file-saving library. In that case, close the sidebar and retry the operation.

How can I add a BGM to my video?

Upload a sound file named "bgm.mp3" to the "resources" folder. Then, the Add-On detects it automatically and mixes it with the video.

Can I add several BGMs to one video?

No. The BGM mixing feature of the Add-On is very limited. Its intended use is to add a BGM as a short prelude to the lecture.

How many pages can the Add-On process?

There is no explicit limit. It depends on the capacity of your web browser and computer. More pages require more time and resources to generate the video.

The video generation fails in mid-course. How can I solve it?

The video generation could fail for various reasons including, but not limited to:

  • Internet connection failures,

  • Shortage of your computer memory, and

  • Broken inputs such as invalid sound files for embedding.

Check your internet connection and the other applications running on your computer. Then, retry later.

UPDATE: If you see "Sad Chrome" when the video generation fails, close the sidebar and then retry.

A bug in Chrome causes the "Sad Chrome" case. We updated Lecture Video Maker to cache the interim results, so a retry can effectively resume from them.

The sidebar sometimes grays out and becomes inoperable. Why?

The Add-On automatically checks and recalculates the duration of slides once a minute. This periodic check queries the last-updated time of the presentation and invokes the recalculation only when the presentation has been updated. This behavior causes occasional periods in which the sidebar is inoperable, as you saw.
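
The idea of recalculating only after an update can be sketched in Python as below. This is a hypothetical illustration of the described behavior, not the Add-On's code: the class name, the method name, and the use of plain numbers as timestamps are assumptions, and the one-minute timer itself is left out.

```python
class DurationChecker:
    """Recalculate only when the presentation's last-updated time changes."""

    def __init__(self):
        self.last_seen = None
        self.recalculations = 0

    def check(self, last_updated):
        """Run one periodic check; return True if a recalculation happened."""
        if last_updated == self.last_seen:
            return False  # nothing changed since the previous check
        self.last_seen = last_updated
        self.recalculations += 1  # stands in for the real recalculation
        return True


checker = DurationChecker()
for timestamp in (100.0, 100.0, 160.0):
    print(timestamp, checker.check(timestamp))
```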

The icon appearing after recording my voice is not the microphone icon. Why?

It is due to a technical limitation of Google Drive. The recording and the update of the slide list cannot be completely synchronous. Occasionally, the update process cannot see the latest recorded data, which leads to an incorrect icon being displayed. However, the slide list is updated periodically and will be corrected eventually.

The speech synthesis fails. It says "Request contains an invalid argument."

One possible cause is a set-up failure of your API key. The API key supplied to the Add-On must have access to Google Cloud Text-to-Speech. See Tutorial #3 and check the following points:

  1. Have you enabled Google Cloud Text-to-Speech?

  2. Have you restricted the API key in a way that excludes access to Google Cloud Text-to-Speech?

  3. Can you see the voice list on the Voice Chooser form? If not, the API key is not correctly set up.

If you still fail to synthesize the speech, please create a new support ticket and contact us.

Does the Add-On have the feature of desktop capture?

No. If you want to embed your desktop operations as part of the lecture video, please use another tool for that part. For your information, the operation parts in the tutorial videos were recorded with OBS Studio, which is free and open-source.