Jeff Foster – ProVideo Coalition
https://www.provideocoalition.com

First Look: GoPro HERO13 Black + A New Smaller HERO
https://www.provideocoalition.com/first-look-gopro-hero13-black/
Wed, 04 Sep 2024

GoPro announced today the release of TWO new action cams: the HERO13 Black and the new HERO – the smallest and most compact GoPro camera to date. Responding to customer requests for its flagship lineup, the HERO13 Black offers significantly more than the HERO12 Black. I’ll run down the new features and comparisons here – much of it gleaned from the provided reviewer’s guides. I also tested it briefly and will share my results, but look for a much deeper dive review once I’ve had a chance to really dig into the features and accessories available to me.


(From press release):

San Mateo, Calif., Sept. 4, 2024 – Today, GoPro (NASDAQ: GPRO) announced two new GoPro cameras: the top-of-the-line $399 HERO13 Black with its HB-Series Lenses, and the smallest, simplest 4K camera, the $199 HERO. Each camera offers unique new features that build upon GoPro-patented, customer-favorite technologies.

“This year’s two new GoPro cameras are inspired by and built for the GoPro Community,” says GoPro Founder and CEO Nicholas Woodman. “HERO13 Black features four new interchangeable lens mods that HERO13 automatically detects and adjusts its settings for, magnetic latch mounting, GPS, and a more powerful battery that provides longer runtimes and improved thermal performance. And then there’s HERO, the smallest, lightest 4K GoPro ever with a screen that will appeal to new and experienced users with its rugged, ultralight design and impressive 4K image quality.”

What’s New?


HERO13 Black

HERO13 Black is a powerful combination of leading-edge tech and endless creative possibilities. Beyond best-in-class 5.3K 60 frames per second video, Emmy® Award-winning HyperSmooth stabilization, and core GoPro characteristics, HERO13 Black features new:

  • Incredible 13x Burst Slo-Mo – Captures up to 400 frames per second at HD-quality 720p video, as well as 5.3K at 120 frames per second and 900p at 360 frames per second video.
  • Redesigned 10% Larger Capacity, More Power Efficient Enduro Battery – Extends runtimes in all weather conditions.
  • Snap and Go Magnetic Latch Mounting – Joins built-in mounting fingers and 1/4-20 mounting threads for three ways to mount HERO13 Black.
  • Faster WiFi 6 Technology – For up to 40% quicker content transfer speeds.
  • Professional-Level Hybrid Log Gamma (HLG) HDR Video – Broadcast-standard 10bit and Rec. 2100 color space and a wider color gamut than HDR alone.
  • More Customizable Audio Tuning – Choose balanced true-to-life sound or a Voice setting that enhances vocal clarity, while also preserving ambient background sounds.
  • GPS + Performance Stickers – Track speed, path, terrain, altitude and g-force, and help with geotagging in third party media management apps.
  • Plus, more custom preset options, improved QuikCapture and more.


HERO13 Black ($399.99) customers can choose between the standalone camera, the HERO13 Black Creator Edition ($599.99) complete with the Volta Power Grip, Media Mod and Light Mod, or activity-optimized HERO13 Black Accessory Bundles exclusive to GoPro.com. All are available for preorder today. Preorder shipping and global retail on-shelf availability will begin on September 10.

Runs cooler for longer

The HERO13 Black is designed to run cooler both in static environments and in motion, utilizing an improved airflow case and heat-dissipation design. This addresses one of the main complaints voiced about the past few models, which would overheat and shut off until the camera cooled down and could be restarted.

Here are some comparisons from GoPro’s reviewer’s guide:

(Thermal and runtime comparison charts, HERO13 Black vs. previous models)

Redesigned Power Solutions

A 10% larger capacity 1900mAh Enduro battery combines with improved power efficiency and a redesigned battery enclosure to deliver longer runtimes in all conditions. HERO13 Black provides 1.5 hours of continuous recording in the highest resolution 4K30 and 5.3K30 settings, plus over 2.5 hours of continuous recording at HD-quality 1080p30.

NOTE: The new 1900mAh Enduro battery is unique to the HERO13 Black and not interchangeable with the previous Enduro batteries used in the HERO12 and 11.

 

Level up to all-weather power by adding the Contacto Magnetic Door and Power Cable Kit ($79.99). This power extension accessory magnetically attaches to a specially designed side door for HERO13 Black that allows for a quick, simple and weatherproof way to connect an external power source.

 


HB Series Lenses

HERO13 Black customers can level up their creative game by adding the innovative, new HERO Black (HB)-Series Lenses to their cart. Each lens is automatically detected by HERO13 Black to provide optimal settings options based on the lens type and the environment.


The four HB-Series Lenses are the only lenses compatible with GoPro-patented HyperSmooth stabilization and are waterproof and scratch resistant with hydrophobic coatings.

  • Ultra Wide Lens Mod ($99.99): Transforms HERO13 Black into the ultimate POV camera by capturing more in every shot with a 177° field of view and new 1:1 aspect ratio, giving you more field of view and the freedom to crop your footage to widescreen 16:9 or vertical 9:16 shots— no matter how the camera is mounted. And, this lens maximizes HyperSmooth performance for unbreakable 360° Horizon Lock in-camera video stabilization up to 4K60 resolution.
  • Macro Lens Mod ($129.99): Expand creative possibilities with variable focus on objects in the distance and objects that can be up to 4x closer than the standard GoPro lens. The variable focus ring lets you manually adjust the focus distance from as close as 4.3in (11cm).
  • Anamorphic Lens Mod ($129.99): Capture ultra wide-angle, artistic footage with less distortion than traditional wide-angle perspectives in a dramatic 21:9 aspect ratio—just like you see in feature films. Cinematic lens flares add to the character of your footage, while in-camera “de-squeezing” makes it easy to capture, review and edit anamorphic content without intensive post-production workflows.
  • ND Filter 4-Pack ($69.99): Easily create cinematic motion blur in your shots with HB-Series neutral density (ND) filters in ND4 / ND8 / ND16 / ND32. Simply attach a filter and your HERO13 Black automatically detects it, toggles into Auto Cinematic video mode, and dials in the best settings based on your environment.

Ultra Wide Lens Mod, Macro Lens Mod and the ND Filter 4-Pack are sold separately from HERO13 Black and are now available for preorder at GoPro.com. Preorder shipping and global retail on-shelf availability will begin on September 10. Anamorphic Lens Mod will be available in 2025.


Mounts & Accessories

New Magnetic Latch ($24.99) and Ball Joint Mounts ($39.99) introduce the quickest way to move HERO13 Black between various mounts. They are compatible with all existing GoPro mounts and use a quick dual-latch system to secure your camera. Simply pinch to release and swap mounts or flip your camera 180° to change up the perspective.

HERO13 Black customers can also mount their camera using built-in mounting fingers for the most secure, low-profile mounting option and 1/4-20 mounting threads to be used with tripods and standard professional camera accessories.

I’ll be showing more on these in my extensive Hands-On review at the end of the month.

Hands-on with the HERO13 Black


The one thing I really wanted to test for this first-look review was the new Macro Lens and the Magnetic Latch mount. So I decided to film some shots while making dinner one night and cut them together the way a home-cook influencer video might be edited.

In this case, it’s “Mushroom Meatballs,” where I use ground turkey for the protein and five different kinds of mushrooms for the gravy. All shot in my kitchen with available overhead lighting – no studio lights or screens (which it really needed, but I was hungry!) 😉


I tried one shot in super slo-mo, but the LED lights in my kitchen strobed badly, so I cut it down to 2X (50%) and shot in 4K.

I simply jump-cut all the steps and ingredients together to simulate the process. These clips are all full resolution – a few retimed for the slo-mo effect – but otherwise straight out of the camera with no post processing (and no audio):

Look for more tests and comparison videos in my extensive Hands-On review at the end of the month.


New HERO

HERO is the smallest, lightest, simplest to use and lowest cost 4K camera ever with a screen. It features:

  • Ultra Compact Design, Weighing Only 86g: With built-in mounting fingers included, HERO has 35% less volume and 46% less mass than HERO13 Black.
  • Rugged + Waterproof to 16ft (5m): Completely waterproof and built with legendary GoPro durability, HERO is ready to capture the fun whether you’re ripping through mud, snow, water or just exploring a new city.
  • Intuitive Touch Screen + One-Button Control: Use HERO’s LCD screen to frame your shots perfectly and simply swipe or press the mode button to swap modes. When you’re ready, hit the shutter button to start capturing.
  • Stunning Image Quality + 2x Slo-Mo: Capture in Ultra HD 4K and HD 1080p video, 12MP photos or slow things down with 2.7K 60 frames per second. You can also grab 8MP frame grabs from your 4K videos using the Quik app.
  • 16:9 Aspect Ratio: Delivers YouTube-optimized, horizontal video.
  • HyperSmooth Video Stabilization With the Quik App: The GoPro Quik app uses HyperSmooth video stabilization to automatically smooth out the bumps in your footage.
  • Long-Lasting Enduro Battery: HERO can record continuously for up to 100 minutes at its highest video setting on a single charge.**


HERO ($199.99) is available for preorder today. Preorder shipping and global retail on-shelf availability will begin on September 22. (I’ll be including the HERO in my deep dive review at the end of the month).


Quik Software

Both HERO13 Black and HERO are compatible with the Quik app to benefit from the following GoPro subscriber perks:

  • Highlight Videos Automatically Sent to Your Phone – Simply plug in your GoPro when connected to your home Wi-Fi. While it’s charging, your footage is automatically uploaded to the cloud and used to make a highlight video complete with beat-synced music and effects. Videos are automatically sent to your phone and ready to share.
  • Edit Your Shots with the Quik App – Tap into an array of easy-to-use tools that let you edit your footage like a pro. You can tweak the highlight videos created by the app or make your own videos from scratch. You can also zoom in, crop, add filters and data overlays, and more with your footage.
  • Easy Transferring + Unlimited Cloud Backup – Transferring photos and videos to your phone via the Quik app is a snap with wireless upload. There’s also unlimited cloud storage with hassle-free auto-upload. Just plug in your camera when connected to your home Wi-Fi and your GoPro does the rest.


GoPro customers can unlock the above with the Premium ($49.99/year) or Premium+ ($99.99/year) GoPro subscriptions, available in the Quik app or at GoPro.com.

The Comparisons

As always, I like to include all the technical specs and comparisons between new/previous models:

(Technical spec comparison charts between the new HERO13 Black, the new HERO and previous models)

AI Tools: Are AI developers really keeping us safe?
https://www.provideocoalition.com/ai-tools-are-ai-developers-keeping-us-safe/
Sun, 31 Mar 2024

With the latest developments in AI video software tools producing such real and believable results with cloning capabilities, we are now in a new age of disinformation and identity theft/IP rustling. What are some of the major players doing to attempt to protect the public from this threat – especially in such a volatile political/election year?

Well, there is one company that is taking it seriously…


HeyGen Cloned Video Avatars

Still the leader in AI video avatar production, HeyGen made a sudden, bold move to protect people’s identities that put a screeching halt to some producers and projects earlier this month. Average users who record their sample videos directly through their laptops or devices felt no change, but video producers with Pro accounts ran into a bigger issue.

HeyGen’s published Privacy & Moderation Policies:

https://www.heygen.com/policy

https://www.heygen.com/moderation-policy

The workflow I have outlined in my tutorials and AI video workshops no longer worked. We could record a professional green screen video in the studio and have the talent record their authorization statement at the same time. Then we could upload the video sample and the recorded authorization statement to create the avatar.

But that option was suddenly removed without warning and we were stuck with recorded footage and no way to upload it to generate avatars for the project we’re working on.

So I reached out to the sales team to find out why and was told that I needed an Enterprise plan, which starts with a $10,000 buy-in.

I was able to get ahold of HeyGen’s senior management for comment and was contacted by their co-founder and CPO, Wayne Liang, who shared the reason for the unannounced move on a Zoom call. I left the conversation feeling that they really made the right move, as they were actively blocking groups of people trying to trick the system and bypass proper authorization in order to create malicious and harmful avatar videos – exactly what we as working professionals are trying to avoid.

At last word, they are revamping some of their user plans to be more aligned with their customers’ needs. At the time this article is published, their plans are as follows:

(HeyGen’s current plan tiers and pricing)

Current Upload & Approval Process

What we have currently are two options for uploading/approving when shooting talent for a professional video for the avatar without having an Enterprise account.

1: You must be able to upload your video directly at the studio while the talent records their consent on a laptop logged in to your account.

2: You get your video footage with the talent and then color correct and clean up the plate before submitting – then you send your submission clip to the talent and let them log into your account and submit the video on a Zoom call.

I have successfully done both methods working remotely with the studios and with talent that was able to approve the video a couple weeks after the shoot.


In this example I followed option #2 above and contacted the corporate talent via a Teams call. I shared the video clip through our Frame.io account so he could download it to his desktop and then upload it again in HeyGen.

Obviously you have to give your talent access to your HeyGen account, so you might want to set a temporary password on it until they’ve finished uploading.

There is a series of prompts you can walk them through on a Zoom call, and then they have to read an authorization script on camera, which also includes a random security code so it can’t be bypassed.


After going through this process I can see that this is indeed a proper safeguard to keep your image safe.

I’ve provided an edited video of my Teams call with the talent for this particular shoot so you can see how simple the process really is.

HeyGen’s Safety & Ethics page: https://www.heygen.com/ethics

AI Tools: OpenAI Reveals Sora and Shocks the AI Video Industry
https://www.provideocoalition.com/ai-tools-openai-reveals-sora/
Sat, 17 Feb 2024

On Thursday this week (Feb 15), Sam Altman, CEO of OpenAI (ChatGPT, DALL-E), released a sneak peek into our not-too-distant future of realistic AI-generated text-to-video content with the announcement of their new model, Sora, in a “Xitter” post:

What is Sora?

*From OpenAI website –

*Creating video from text

Sora is an AI model that can create realistic and imaginative scenes from text instructions.

We’re teaching AI to understand and simulate the physical world in motion, with the goal of training models that help people solve problems that require real-world interaction.

Introducing Sora, our text-to-video model. Sora can generate videos up to a minute long while maintaining visual quality and adherence to the user’s prompt.

Today, Sora is becoming available to red teamers to assess critical areas for harms or risks. We are also granting access to a number of visual artists, designers, and filmmakers to gain feedback on how to advance the model to be most helpful for creative professionals.

We’re sharing our research progress early to start working with and getting feedback from people outside of OpenAI and to give the public a sense of what AI capabilities are on the horizon.

Sora is able to generate complex scenes with multiple characters, specific types of motion, and accurate details of the subject and background. The model understands not only what the user has asked for in the prompt, but also how those things exist in the physical world.

The model has a deep understanding of language, enabling it to accurately interpret prompts and generate compelling characters that express vibrant emotions. Sora can also create multiple shots within a single generated video that accurately persist characters and visual style.

The current model has weaknesses. It may struggle with accurately simulating the physics of a complex scene, and may not understand specific instances of cause and effect. For example, a person might take a bite out of a cookie, but afterward, the cookie may not have a bite mark.

The model may also confuse spatial details of a prompt, for example, mixing up left and right, and may struggle with precise descriptions of events that take place over time, like following a specific camera trajectory.

Altman then teased the public by taking on live requests to generate AI videos up to a minute long with prompts from the audience:

And some of the results were pretty amazing!

Pretty impressive – and up to a minute in length!

How does Sora work?

*From the Sora website – BE SURE TO CLICK THE TECHNICAL REPORT LINK!

*Research techniques

Sora is a diffusion model, which generates a video by starting off with one that looks like static noise and gradually transforms it by removing the noise over many steps.

Sora is capable of generating entire videos all at once or extending generated videos to make them longer. By giving the model foresight of many frames at a time, we’ve solved a challenging problem of making sure a subject stays the same even when it goes out of view temporarily.

Similar to GPT models, Sora uses a transformer architecture, unlocking superior scaling performance.

We represent videos and images as collections of smaller units of data called patches, each of which is akin to a token in GPT. By unifying how we represent data, we can train diffusion transformers on a wider range of visual data than was possible before, spanning different durations, resolutions and aspect ratios.

Sora builds on past research in DALL·E and GPT models. It uses the recaptioning technique from DALL·E 3, which involves generating highly descriptive captions for the visual training data. As a result, the model is able to follow the user’s text instructions in the generated video more faithfully.


In addition to being able to generate a video solely from text instructions, the model is able to take an existing still image and generate a video from it, animating the image’s contents with accuracy and attention to small detail. The model can also take an existing video and extend it or fill in missing frames. Learn more in our technical report.

Sora serves as a foundation for models that can understand and simulate the real world, a capability we believe will be an important milestone for achieving AGI.
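To make the “start with static noise and gradually remove it” idea above concrete, here is a purely illustrative sketch of a diffusion-style sampling loop. This is not OpenAI’s code – the toy “denoiser,” the linear noise schedule and the frame size are stand-in assumptions – but it shows the mechanic the research page describes: begin with noise and refine it over many steps.

```python
import numpy as np

# Toy stand-in for a trained denoiser. A real model like Sora would be a large
# transformer operating on spatiotemporal "patches"; here we just nudge the
# sample toward a fixed target so the loop has something to converge to.
TARGET = np.ones((8, 8, 3)) * 0.5            # pretend "clean" frame, values in [0, 1]

def predict_denoised(noisy_sample, noise_level):
    # A real denoiser predicts the clean signal (or the noise) from the noisy input.
    return TARGET + 0.1 * noise_level * np.random.randn(*noisy_sample.shape)

def sample(num_steps=50, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((8, 8, 3))        # start from pure static noise
    for step in reversed(range(num_steps)):
        noise_level = (step + 1) / num_steps  # simple linear schedule (an assumption)
        x_clean_est = predict_denoised(x, noise_level)
        # Blend toward the estimate, re-adding a little noise except on the last step.
        x = x_clean_est + (noise_level if step > 0 else 0.0) * rng.standard_normal(x.shape)
    return np.clip(x, 0.0, 1.0)

frame = sample()
print(frame.shape, frame.mean())
```

A video model extends this same loop across many frames at once, operating on patch tokens rather than raw pixels – which, per the description above, is what lets Sora keep a subject consistent even when it temporarily leaves the frame.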

Initial Opinions and Overviews

Since this is making quite a buzz around social media already, and none of us mere citizens have access to the tech yet, there’s no sense in reinventing the wheel with a video overview and rundown when there are already some good tech vloggers out there on top of it!

 

Is it too good? Should we be concerned?

As others mentioned in the above overview videos, the first markets directly affected by the latest generative AI models such as Sora will be stock photography and stock video used for short B-roll clips in general video productions. We’re already seeing AI-generated video and animation clips being used in marketing, but the real fear is something AI-generated being passed off as “real” – in journalism, campaign ads, etc. – hence the safety warnings and processes intended to guard against that.

Of course the tech writers are all discussing and addressing public concerns before this model is released for the public to play with. It appears that OpenAI is consciously trying to get ahead of the issues and potential problems generated with the technology:

*From OpenAI website –

*Safety

We’ll be taking several important safety steps ahead of making Sora available in OpenAI’s products. We are working with red teamers — domain experts in areas like misinformation, hateful content, and bias — who will be adversarially testing the model.

We’re also building tools to help detect misleading content such as a detection classifier that can tell when a video was generated by Sora. We plan to include C2PA metadata in the future if we deploy the model in an OpenAI product.

In addition to us developing new techniques to prepare for deployment, we’re leveraging the existing safety methods that we built for our products that use DALL·E 3, which are applicable to Sora as well.

For example, once in an OpenAI product, our text classifier will check and reject text input prompts that are in violation of our usage policies, like those that request extreme violence, sexual content, hateful imagery, celebrity likeness, or the IP of others. We’ve also developed robust image classifiers that are used to review the frames of every video generated to help ensure that it adheres to our usage policies, before it’s shown to the user.

We’ll be engaging policymakers, educators and artists around the world to understand their concerns and to identify positive use cases for this new technology. Despite extensive research and testing, we cannot predict all of the beneficial ways people will use our technology, nor all the ways people will abuse it. That’s why we believe that learning from real-world use is a critical component of creating and releasing increasingly safe AI systems over time.

Some comparisons with other Generative AI tools…

Is it all in the prompts? Are there shared LLM models somewhere on the backend?

For kicks, I tried a few myself with Runway AI using the exact same prompts in this short video test (It fails miserably on all counts!) 😮

Now that Runway has been called-out, we’ll see how they end up rising to the challenge!

Nick St. Pierre on X has discovered some strange similarities with results from Midjourney with the same text prompts. Click through to see his results:

Some resulting renders were eerily similar – such as the woman’s dress below.

What’s next?

Obviously, 2024 is off to an amazing start with Generative AI Tools, and of course we’ll be on a close watch with Sora and all the competition rising to the challenge. We are literally just two days from the announcement and there’s still so much to learn and test, but we all know how this industry is changing by the minute.

Excerpted from Leslie Katz’ article on Forbes yesterday:

Generative AI tools are, of course, generating a range of responses, from excitement about creative possibilities to anger about possible copyright infringement and fear about the impact on the livelihood of those in creative industries—and on creativity itself. Sora is no different.

“Hollywood is about to implode and go thermonuclear,” one X user wrote in response to Sora’s arrival.

OpenAI said it needs to complete safety checks before making Sora publicly available. Experts in areas like misinformation, hateful content and bias  will be “adversarially” testing the model, the company said in a blog post.

“Despite extensive research and testing, we cannot predict all of the beneficial ways people will use our technology, nor all the ways people will abuse it,” OpenAI said. “That’s why we believe that learning from real-world use is a critical component of creating and releasing increasingly safe AI systems over time.”

In the meantime, we’re waiting to see the benchmarks raised 😉

AI Tools: Ultimate AI Upscaling with Magnific AI
https://www.provideocoalition.com/ai-tools-ultimate-ai-upscaling-with-magnific-ai/
Sun, 28 Jan 2024

While there are already some amazing upscaling tools on the market to clean up and restore blurry or low-res images, Magnific AI takes a different approach to generative AI upscaling, letting you dive deep into an image and create amazing unseen details at high magnification. Beyond simply upscaling a photograph for a tighter crop (which it does amazingly well), you can also create a world of fantasy with a little imagination and manipulation in the editing.


Magnific AI

https://magnific.ai/

Rather than providing a simple up-res of your image and filling in details automatically, Magnific lets you guide the output with text-to-image prompts and controls that alter the generated detail.


Once your upscaled images are generated, the settings are shown in the upper right so you can regenerate other images similarly, or make adjustments to your image settings and re-render.


This is a demo video that Magnific.ai posted on their Instagram account (captured and edited to remove commenters’ identities).

But all this “magic” comes at a cost – Magnific AI isn’t necessarily a budget upscaler. I’m currently checking it out with the Magnific Pro monthly plan for a couple of months to determine whether it’s worth upgrading, or saving a few $$ by paying up front for an annual plan (you get two months free). I do think the Pro plan, which provides about 300 images a month, would be sufficient for most independent content creators and producers, so it’s really reasonable if you put it to work.


Magnific AI Testing

Most of the tests and examples shared in the forums are of Generative AI or CGI images that get upscaled, but how well does it work on vintage photos?

I gave it a try with this old public-domain image of the Fuller Flatiron building in New York City from the early 1900s. It’s a really low-quality, fuzzy image that I thought would make a good test.


I then upscaled it a couple of times, with minimal changes to the “creativity” and resemblance settings, to achieve 4x clarity. This is the scale of each upres compared to the original image:

(Scale comparison of each upres pass against the original photo)

I’m sharing the following series of images to reveal the level of detail generated by Magnific in each upres. Some sections of the image obviously work better than others.

(Detail crops from each successive upres pass)

As you can see, the noise is reduced in the 4x version, but everything is flattened out drastically, making it appear at scale more like a graphic than a photograph. I would have to play with the slider settings more, and possibly process different crops separately and composite them back together.

I also suspect that by cropping out the more detailed areas – say, to focus on a horse and wagon – you could regenerate those separately and stitch them back together in Photoshop. Very time consuming, but potentially doable.
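If you were to automate that crop–regenerate–composite round trip, the bookkeeping is mostly pixel math. Here’s a minimal sketch in Python with Pillow; the `upscale_with_magnific()` call is a hypothetical placeholder for however you run the crop through Magnific (manually or otherwise), and the filenames and crop coordinates are placeholders too:

```python
from PIL import Image

SCALE = 2  # assume each Magnific pass doubles the resolution

def upscale_with_magnific(img: Image.Image) -> Image.Image:
    # Hypothetical placeholder: in practice you would export this crop,
    # run it through Magnific, and load the result back in.
    return img.resize((img.width * SCALE, img.height * SCALE), Image.LANCZOS)

def enhance_region(photo_path, box, out_path):
    """Crop a region, 'upscale' it, and paste it back over an enlarged base image."""
    base = Image.open(photo_path)
    crop = base.crop(box)                      # box = (left, top, right, bottom)
    detailed = upscale_with_magnific(crop)

    # Enlarge the whole base so the enhanced crop drops back in at the same scale.
    big = base.resize((base.width * SCALE, base.height * SCALE), Image.LANCZOS)
    big.paste(detailed, (box[0] * SCALE, box[1] * SCALE))
    big.save(out_path)

# Example: enhance a 512x512 region starting at (1000, 800) in the scan.
enhance_region("flatiron_scan.jpg", (1000, 800, 1512, 1312), "flatiron_enhanced.jpg")
```

The seams between regenerated regions would still need feathering or masking in Photoshop, which is where the hand work comes in.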

Moving on to AI Generated images out of Midjourney v6, I tried a couple different examples to see what the results may be like.

In this first example, I generated an image with Midjourney v6 and then upscaled into its details to roughly 1600x through various crops and renders.

(click on image below to view full-res version)

Midjourney v6 generated image upscaled with Magnific AI

*Note that there are limitations to the rendered file size in Magnific AI, so at these extreme magnifications you need to crop and re-upload each render into the area of detail you wish to zoom into. In these examples, I zoomed into the eyes of my subjects.

Here’s a quick screen capture of the imaging process:

I created another image similarly, starting with a descriptive prompt in Midjourney v6 for my initial image, then I upscaled it to about 800x before adding yet another Midjourney v6 image into the reflection of the subject’s eye.


(click on image below to view full-res version)

(Final composite: the Midjourney v6 image upscaled with Magnific AI, with a second generated image placed in the reflection of the subject’s eye)

Animating the Results

I imported all of my image renders into After Effects as layers and scaled and keyframed them to create a smooth magnification zoom into the eye details – the result is in the video below.
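One detail worth noting if you rebuild this kind of zoom: for the motion to feel constant, the scale has to grow exponentially rather than linearly. Here’s a small sketch of the math I mean – a generic approach, not the exact keyframes used in this project, with the magnification and duration values as placeholders:

```python
TOTAL_MAG = 800        # overall zoom, e.g. ~800x into the eye detail
DURATION_S = 20        # length of the move
FPS = 30

def scale_at(frame):
    """Exponential zoom: equal perceived zoom speed on every frame."""
    t = frame / (DURATION_S * FPS)   # 0.0 -> 1.0 across the move
    return TOTAL_MAG ** t            # 1x at the start, TOTAL_MAG at the end

# Each nested layer takes over once the zoom passes its own magnification,
# so printing a few values shows roughly where to hand off between layers.
for f in range(0, DURATION_S * FPS + 1, 60):
    print(f"frame {f:3d}: scale {scale_at(f):9.2f}x")
```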

 

 

AI Tools Part 5: 2023 A Year in Review
https://www.provideocoalition.com/ai-tools-part-5/
Fri, 29 Dec 2023

What a year it’s been for Generative AI technology and the tools that have been developed! When I first started the AI Tools series with Part 1 back in January of 2023, I could only see a sliver of what was to come in the months ahead. I then decided we needed a central place where we could reference all these emerging tools and their updates, so I created AI Tools: The List You Need Now in February. By the time I got to AI Tools Part 2 in March, it was apparent that developers were moving full steam ahead, providing updates and new tools every week. I had to pull back from doing such rapid updates, which were becoming obsolete as soon as I’d publish, and put out AI Tools Part 3 in June to give us a sense of where we were mid-year. Then, focusing primarily on video and animation tools, I published AI Tools Part 4 in August, along with a smattering of how-to articles and tutorials for various AI tools (which I’ll cover in this wrap-up) – and that brings us here, to the end of the year.

And here we stand, looking into the possibilities of our future.

Center image generated in Midjourney 5.2, then expanded outward in Adobe Photoshop with Generative Fill/Expand.

 

Where we started… Where we are going.

The biggest fears about Generative AI in January were that it was cheating; theft; not real art; going to take our jobs away; going to steal your soul, etc. Well, only some of that has happened so far. I’ve seen many of my nay-saying design, imaging and photography colleagues who were actually angry about it then embrace the capabilities and adapt their own style in how they use it in their workflows and compositions. Some have made their work a lot more interesting and creative as a result! I always applaud and encourage them in these advancements, because they’re trying something new and the results are beautiful.

And I see how many folks are jumping on the bandwagon with full abandon of their past limitations, genuinely having a lot of fun and unleashing the creative beast. I think I’ve fallen into this category, as I’m always interested to see what I can do with the next tool that opens up a portal to creativity. I’m an explorer and I get bored really quickly.

In reality, nobody has lost their soul to AI yet, and the jury is still out as to whether any real theft or plagiarism has cost anyone money in actual IP losses, though there have been some class action suits filed*. Where there have been actual job losses and threats to IP is in the entertainment industry – where voices, faces and likenesses can be cloned or synthesized and the actors are no longer needed for certain roles. We’ve witnessed the strikes against the big studios over unfair use – and threatened use – of people’s likenesses in other productions without compensation.

*Note that as of December 27, 2023, the first major lawsuit against OpenAI and Microsoft was filed by the New York Times for copyright infringement.

As an independent producer, VFX artist and corporate/industrial video guy, I have seen both sides of this coin in a very real and very scary way. Because I can do these things with a desktop computer and a web browser – today. We can clone someone’s voice, their likeness, motion, etc. and create video avatars or translate what they’re saying into another language or make them say things they probably wouldn’t (which has a long list of ethical issues which we will undoubtedly experience in the coming election year, sadly.)

We are using synthesized voices for most of our Voiceover productions now, so we’re not hiring outside talent (or recording ourselves) anymore for general how-to videos. This gives us ultimate flexibility and consistency throughout a production or series, and can instantly fix mispronunciations, or change the inflection of the artist’s speech to match the project’s needs. It’s not perfect and can’t really replace the human element with most dramatic reads that require emotion and energy (yet) but for industrial and light commercial work, it does the trick. And I can’t help but feel a little guilty for not hiring voice actors now.

For cloning/video avatar work, we independent producers MUST take the initiative to protect the rights of the actors we hire for projects. We are striving for fair compensation for their performance and a buy-out of projected use with strict limitations – just like commercial works. And they agree to participate, on camera and in a written contract, before we can even engage. We need the talent to give us the content needed to clone and produce realistic results, but we’re also not a big studio that’s going to make a 3D virtual actor they can use for anything they want. If there’s a wardrobe change or a new pose, etc., then it’s a new shoot and a new agreement. There are still limitations on what we can do with current technology, but there will soon be a day when those limitations are lifted even at the prosumer level. I’m not sure what that even looks like right now…

All we can do is stay alert, be honest and ethical and fair, and try to navigate these fast and crazy waters we’ve entered like the digital pioneers that we are. These are tools and some tools can kill you if mishandled, so let’s try not to lose a limb out there!

The AI Tool roadmap to here…

Let’s look back at this past year and track the development of some of these AI Tools and technologies.

Let’s start with the original inspiration that got me hooked: text-to-image tools. I’ve been using Midjourney since June of 2022, and it has evolved an insane amount since then. We’re currently at version 6.0 (Alpha).

Since I wanted to keep it a fair test all along, I used the same text prompt and only varied the version Midjourney was running at the time. It was a silly prompt when I first wrote it back in June of 2022, but back then we were lucky if we got only two eyes on every face it output, so we tried every crazy thing that popped into our heads! (Well, I still do!) 😉

Text prompt: Uma Thurman making a sandwich Tarantino style

I really have no idea what kind of sandwich Quentin prefers, and none of these ended up with a Kill Bill vibe, but you’ll notice that the 4-image cluster produced in 6/22 had a much smaller resolution output than subsequent versions. The fourth quadrant, from 12/23, was done with v5.2 using the same text prompt. (Check out the 4X upscaling with v6.0 directly below @ 4096×4096.)


This is the upscaled image at full resolution (4096×4096), a straight download from Midjourney in Discord with no retouching or further enhancements, nor were there any other prompts to provide details, lighting, textures, etc. – just the original prompt upscaled 4x. (Cropped detail below if you don’t want to download the full 4K image to view at 100%.)


The difference with Midjourney v6.0 Alpha

Using the exact same text prompt gave me a very different result without any other prompts or settings changes. The results were often quite different (many don’t look like the subject at all) and they have a painterly style by default. But the biggest thing is, the AI understands that she’s actually MAKING a sandwich – not just holding or eating a sandwich. I think this is a big step for the text-to-image generator, and while I upscaled the one that did look most like Uma, I didn’t try to change any parameters or prompts to make it more photorealistic or anything; I’m still quite pleased with the results!


We’ve all seen the numerous demos and posts about Adobe Photoshop’s Generative Fill AI (powered by Firefly) and I’ve shared examples using it with video in a couple of my articles and in my workshops. It’s really become a useful tool for designers and image editors to extend scenes to “zoom out” or fit a design profile – like these examples from my AI Tools Part 4 article in August:

(For demonstration purposes only. All rights to the film examples are property of the studios that hold rights to them.)

Of course there are numerous ways to just have fun with it too! Check out some of the work that Russell Brown from Adobe has created with Generative Fill on his Instagram channel. Russ does really creative composites with a professional result – much of which he does on a mobile device.

For the featured image in this Year in Review article, I used the same Midjourney prompt for the image in my original AI Tools Part 1 article a year ago and then expanded the image with Adobe Photoshop’s Generative Fill to enhance the outer part of this “world”. The tools can really just work well together and that allows for more creativity and flexibility in your design work.

And of course there have been other great advancements in AI image enhancement and generative AI tool development this year, including updates to Remini AI and Topaz Labs, and a newcomer, Magnific, that’s making some waves in the forums.

Magnific is a combination of an enhancement tool and a generative AI creator – but starts with an image to enhance, along with additional text prompting and adjustments in the tool’s interface.

Since I just gained access to the tool, I thought I’d start with a Midjourney image that we could zoom into by using Magnific. I used a very simple prompt to get this lovely AI generated starfish on the beach.


I then added it to Magnific and used the same text from my Midjourney prompt to double the upscaling while adding more detail. (Note that currently the maximum upscaling is limited to 2x, with a resulting file resolution of 4K.)


That means you will need to download your rendered results and re-enter them for further upscaling until you max out, then crop an image into the area you want to get more details and then upload and render that. Repeat until you get to a result you like. I’m sure we’re going to be literally going down a rabbit hole as we experiment with this tool in the coming weeks, so stay tuned!


But while image enhancement and detail generation are powerful on their own, content creators and designers are already using these tools to generate stunning simulated extreme “zooms.” For instance, check out this post from Dogan Ural that not only showcases an amazing zoom-in video built from his renders, but also explains the steps he took to create it in the thread.

That’s roughly the reverse of the process I used for my Zoom Out animation with Midjourney and After Effects in my full article AI Tools: Animations with Midjourney & After Effects earlier this summer. I’m looking forward to experimenting with this new process as well!


Audio tools

There have been some advancements in audio tools as well. Take Adobe Podcast for instance.

When it was first released as a beta, it was simply drag-and-drop your audio and hope it cleaned it up (and it usually did pretty well). Now it not only has the Enhance Speech tool but also a good Mic Check tool that will determine whether your setup is good enough quality to record your voiceover. The Studio lets you record, edit and enhance your audio right in your browser, with tools for transcription and pre-edited music beds.


A surprising recent discovery was Moises.ai, a suite of AI tools developed for musicians, available for desktop, web browser or mobile. It has several features I’ve yet to explore fully, such as Voice Studio, Lyric Writer, Audio Mastering and Track Separation.


With the Track Separation feature, you can upload a recorded song and specify how you want the AI to break it down into individual tracks, such as vocals, bass, drums, guitar, strings, etc. It does a pretty remarkable job, letting you isolate and control the volume of the different tracks so you can learn your guitar riffs or sing along with the vocals removed.
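Moises is a hosted service, but if you want to see the same idea running locally, the open-source Demucs project does comparable stem separation. Here’s a rough sketch of driving it from Python – this is Demucs, not Moises’ own engine, the input filename is a placeholder, and the default output folder layout is my assumption, so check the Demucs docs for the current CLI:

```python
import subprocess
from pathlib import Path

SONG = Path("my_song.mp3")   # placeholder input file

# Run the Demucs CLI (pip install demucs). By default it separates the song
# into stems such as vocals, drums, bass and "other" under a "separated/" folder.
subprocess.run(["demucs", str(SONG)], check=True)

# List whatever stems were produced so you can pull them into your DAW or NLE.
for stem in Path("separated").rglob("*.wav"):
    print(stem)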


 

And for fun, you can use Suno.ai where you can generate a short song with just a text description. In this example, I simply wrote “Bouncy pop song about computers” and it generated two different examples, including the lyrics in just seconds.


Here’s a link to the first song it generated (links to a web page)

And here’s the second song (with lyrics show below):


I’ve covered a lot about ElevenLabs AI in several of my articles, and how it has become part of our production workflows for how-to videos and marketing shorts on social media. I’ve even used it in conjunction with my video avatars, covered below.

But new AI tools are coming up to challenge them with features beyond cloning and synthesis, such as adjusting for a range of emotions and varying the delivery of the text. One such tool is PlayHT. You can start with hundreds of synthesized voices and apply various emotions to the read, or clone your own voices and use the tool the same way.


 

Video & Animation tools

I’ve been mostly interested in this area of development as you can see from some of my other AI Tools articles, including AI Tools Part 4 where I shared workflows and technology for video and animation production back in August.

I’ve been experimenting more with available updates to various AI software tools, such as HeyGen, which I featured in an article and tutorial on the production workflow for producing AI Avatars from your video and cloned voice.

Since then, I’ve been further developing the process and producing AI avatar videos for various high-end tech clients (I can’t divulge who here), but I did create this fun project that utilized 100% generative AI for the cloned voice, the video avatars and all the background images/animations. It’s tongue in cheek and probably offensive to many, but it’s gained attention – so, an effective marketing piece!

 

On that note, there are already business models lining up to utilize this technology for commercial applications, such as this model for a personalized news channel: https://www.channel1.ai/ They’re mixing AI Avatars for the news anchors and reporters and feeding collected reels to stream stories to your region and interests.


Another tool that’s been making great strides in video production is Runway AI. I’ve featured it in past articles, but the tools and workflows for producing some creative content have been shared around social media and the community is really getting creative with it.

I featured a how-to article/tutorial on making an “AI World Jump” video like this one:

I’ve also demonstrated several of these tools and techniques at virtual conferences I’ve taught for, such as the Design + AI Summit from CreativePro earlier this month. Here’s a teaser I produced for the session on LinkedIn:

You will be seeing much more utilizing this incredible technology in the coming months. I’m really excited about its development and projects I’ll be working on.

New Technological Developments

So what’s next?

I won’t be continuing these category-laden industry articles into the new year, so expect shorter, individual AI Tools articles and tutorials, which will include updates and, most likely, project workflows. I also won’t be continuing to update the massive “AI Tools: The List,” as it’s nearly a full-time job keeping up – plus there are so many “lists” out there from various tech portals that it’s all becoming redundant. I may do some kind of smaller reference list, but we’ll have to see what the industry does, as things change literally daily.

What I WILL commit to is bringing you exciting new tech as it happens, sharing it as I discover it. The best way to find up-to-the-minute announcements and shared white papers is to follow my LinkedIn account.

 

AI Tools: Runway AI World Jump Videos
https://www.provideocoalition.com/ai-tools-runway-ai-world-jump-tutorial/
Wed, 01 Nov 2023

Runway AI is one of the fastest-growing AI tools for video and animation creators. Their suite extends well beyond their generative AI offerings, but one trend that’s taken off recently is using their Gen-1 Video to Video tool to create AI videos that are totally different from the input video fed into them. One such technique is the “world jump” simulation, where the main subject appears to “jump” into another dimension and become a totally different character while using the same motions as the original. Think back to the TV series “Quantum Leap.”


You can use any NLE to edit your video, but the key is aligning the clips precisely so the video has a smooth transition from scene to scene. Sound design really helps sell the effect as well.

Examples

The video that took the Internet by storm – from Martin Haerlin. He made this concept popular as you’ll see in the following examples, but this original one was just 4 months ago:

Here’s another video Martin created with more developed characters and action at each stage:

Another fun production in the mirror with several characters voiced in:

And most recently, his entry into the Runway Gen:48 short film competition with a mock insurance ad:

The Process

It’s such a clever and popular effect – enough that Runway has a tutorial on their site to show how it’s done generally.

So, being inspired by this technique, I thought I’d give it a try myself…

Here’s my Tutorial video – with more detailed steps below:

Process breakdown

First, record your spin/walking/static selfie video (or shoot your subject close up). It’s best to have some sort of action to define the scene “jumps”, whether it’s a finger snap, a motion or some kind of signal as shown in the above examples.

Then slice up your footage in Adobe Premiere Pro and export each segment as a separate video clip.
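If you’d rather cut the segments from the command line instead of (or in addition to) exporting from Premiere, ffmpeg can do the slicing. Here’s a minimal sketch driven from Python – the source filename and timecodes are placeholders for your own footage:

```python
import subprocess

SOURCE = "selfie_spin.mp4"                     # placeholder source clip
SEGMENTS = [("00:00:00", "00:00:04"),          # (start, end) of each "world"
            ("00:00:04", "00:00:08"),
            ("00:00:08", "00:00:12")]

for i, (start, end) in enumerate(SEGMENTS, start=1):
    # "-c copy" is fast but snaps cuts to keyframes; drop it (re-encode)
    # if you need frame-accurate edits at the jump points.
    subprocess.run([
        "ffmpeg", "-y", "-i", SOURCE,
        "-ss", start, "-to", end,
        "-c", "copy", f"segment_{i:02d}.mp4",
    ], check=True)
```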


Next, log into your Runway AI account and choose Gen-1 Video to Video.


Select the first clip segment that you rendered out of Premiere and upload it to Runway AI.


Add reference images to your library, or select some default styles for your various sequences that will be applied to your original clips. In this example, I used an image I created in Runway AI’s text-to-image generator.


The settings are important to note in the Advanced Tab on the right. As explained in the Runway AI help page, use these as a guideline to dial in the desired effect and adjust from there with each setting.

Once you are happy with a preview look, select Generate Video on the thumbnail to render that segment. Repeat this process for all the clip segments you created from your original video, and download the rendered results.


In Adobe Premiere, assemble all your clips on the timeline and make sure they match end to end.


Since I used a fictional “device” to jump from one dimension to another, I wanted some kind of cosmic “blast” effect, so I created my own in Premiere using several SFX from the Adobe Stock library, stacked until I got a result I was happy with.


I also wanted to add some kind of flash to help sell the “jump” to the next dimension, so I settled on the Boris FX BCC Lens Flash for each transition.


I then used several different ambient sounds from my library to create varying environments  for each segment, which really helps to sell the idea that you are in a different world or dimension.


Stay tuned for more creative exploration with Generative AI Tools!

AI Tools: HeyGen AI Video Avatars & Translations
https://www.provideocoalition.com/ai-tools-heygen-software-review/
Thu, 28 Sep 2023

HeyGen AI is ground-breaking AI software, bringing video avatars and translation/dubbing to the average prosumer production. If you need a talking-head video for marketing, training or how-to content, then this is the place to start! Aside from all the other features, options and templates for generating content from pre-made avatars and images, the real juice in this software is the ability to clone yourself from a video and audio clip – it’s truly incredible.


As I’ve been saying in all my AI Tools updates: AI tool technology is advancing at such a rate that we’re measuring it in days now, not weeks, months or years. The developers at HeyGen are a perfect example of such rapid development – I’ve had to change this product review several times in the past month because their technology, their approach to developing these avatars, and their pricing structure have all changed almost daily.

It really took off when I saw a video posted on LinkedIn by the CEO of HeyGen teasing the capabilities of their new “Avatar Lite” beta. I promptly got on board and applied to start testing – and got this response by email the same day. The rest is a very, very short history!

While I’ll outline several features of this AI tool, the biggest focus for me is on the Video Avatars – which, as I stated above, have been evolving rapidly. For instance, I made this video about a week ago and it’s already outdated in the features, quality and naming of the various tools. “Avatar Lite” used to take 3-5 business days to generate a usable video avatar as shown (mostly with hands-on techs refining the process), but the process is now automated and generates an “Instant Avatar” in mere minutes! You can also now “Finetune” your Instant Avatar (for an additional fee of $59/mo – which we’ll discuss later in this article), and it comes back to you in under 24 hours.

Here’s a completely AI-generated video showing the process – including the voice translations feature mentioned below from only a couple weeks ago:

So what all does HeyGen do?

Video Productions from Templates

Depending on your skill level and requirements, there are many ways to start generating content in HeyGen’s studio. You can select one of dozens of starter templates, modify it directly in the portal, and even swap in avatars from their library. The user interface is really straightforward and easy to navigate and adjust.


There are dozens of pre-loaded avatars and voices available to choose from.


You can also just make a video with an avatar on a green background and composite it directly in your NLE of choice. This example was made from a generic avatar and voice generated in HeyGen and then composited in After Effects as a sample social media short.

Animated Faces from Photos & Images

This was the first step in discovering how much fun this software could be. I discovered it a few months ago and played around with various photos and images rendered from Midjourney. The process has changed a little since then, but the quality has improved a great deal.

They also have an option to generate AI characters with a text description directly inside HeyGen’s interface. It came up with some interesting results but note that only the faces/heads get animated and not the entire torso when you generate avatars this way.

AI Tools: HeyGen AI Video Avatars & Translations 165

It’s a quick process – upload your photo or rendered image (making sure the face is fully visible and central to the frame), apply an AI voice (or a cloned voice via your ElevenLabs API) and then create a video from your text input.

AI Tools: HeyGen AI Video Avatars & Translations 166

Here are a few examples from my headshot photo and a couple of Midjourney images:

Check out the example below where I’m setting up the green screen studio and our studio mannequin “Leana” complains. That was done from an iPhone photo in HeyGen using this same process.

Video Avatars & Voice Cloning

This is where HeyGen splits off from the rest of the pack – and what got me excited about using it for regular marketing and instructional purposes at my day gig at a biotech company. It has generated a lot of interest with our product marketing folks.

The first step is to make sure you have a good video and audio recording to work from. You can just put up a tripod and shoot yourself or your subject in a casual or business environment with a steady background and clean audio for your submission. Avoid moving around or making sudden gestures or facial expressions, and let the video run for a full, uninterrupted 2-5 minutes for the best cloning results.

When you create your avatar, you have to submit a video authorization (from the subject directly) for security purposes. This keeps the site safe from nefarious activities.

AI Tools: HeyGen AI Video Avatars & Translations 167

AI Tools: HeyGen AI Video Avatars & Translations 168

This first video, generated in my home studio office, was a baseline to build my other experiments on:

For more flexibility in my avatars, I set up the green screen studio to shoot more tests of myself, reading the same 2-1/2 minute script from a teleprompter for my comparisons. Setting up the green screen for the first time since the Covid shutdown a few years ago took a while to dial in, so our mannequin “Leana” got a bit impatient standing there all day. (Also animated with HeyGen.) 😉

AI Tools: HeyGen AI Video Avatars & Translations 169

The process is really simple, and I don’t need to outline every step here – HeyGen’s instructions on the website are easy to follow, and they’ve created multiple video tutorials. You can use a prerecorded voice audio file, TTS with a built-in voice, or a clone you’ve generated. I downloaded several audio files from ElevenLabs to generate many of my test videos, but I now prefer using the built-in third-party API to generate the voice directly inside HeyGen, where the voice manager gives me access to whatever ElevenLabs voices are in my account.
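
Since a single prerecorded audio file is what keeps multiple avatars in sync (more on that below), it can be handy to script the VO generation step. Here’s a minimal sketch using ElevenLabs’ public REST endpoint outside of HeyGen – the API key, voice ID, text and voice settings are placeholders, so check the current ElevenLabs docs for the models and options available on your plan:

```python
# A minimal sketch (not HeyGen's workflow): render one VO take with the
# ElevenLabs REST API so the identical audio file can be reused across avatars.
# The API key, voice ID, text and settings below are placeholders.
import requests

API_KEY = "YOUR_ELEVENLABS_API_KEY"
VOICE_ID = "YOUR_CLONED_VOICE_ID"   # hypothetical ID of a cloned voice

response = requests.post(
    f"https://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}",
    headers={"xi-api-key": API_KEY, "Content-Type": "application/json"},
    json={
        "text": "Welcome to this quick overview of our new workflow.",
        "model_id": "eleven_multilingual_v2",
        "voice_settings": {"stability": 0.5, "similarity_boost": 0.75},
    },
)
response.raise_for_status()

# The response body is the rendered audio; save it and upload it to HeyGen
# as the prerecorded voice track for each avatar.
with open("vo_take_01.mp3", "wb") as f:
    f.write(response.content)
```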

AI Tools: HeyGen AI Video Avatars & Translations 170

So for this example, I applied one AI-generated VO audio file from ElevenLabs.com to create three different versions of the same script, to see how they compared – or differed from each other. Keep in mind that these three avatars aren’t just dressed differently; they were sourced from three separate videos that I shot on the green screen at separate times. Applying the same prerecorded AI voice from ElevenLabs assured the avatars would sync properly. I could not have gotten this result had I run the ElevenLabs API to generate the VO on the fly each time, as there would be variations between the voices.

In this example, I ran the same script and composite in Premiere just exchanging the green screen composites from After Effects in the same sequence.

Instant Avatar vs “Finetuned”

The Instant Avatars you get with your plan are adequate for most purposes (you can purchase more if needed), but the Finetuned Avatars do have better mouth and lip sync performance, as seen in my testing.

In this example video, I used ElevenLabs to produce the audio track which I uploaded to HeyGen when I created the video avatars, so they have the exact same audio track for true side-by-side comparison. Notice the accuracy of the lip sync is improved on the Finetuned version on the right.

AI Tools: HeyGen AI Video Avatars & Translations 170

Better yet – I’ve found that by using the ElevenLabs API link directly inside of HeyGen, I get much better lip syncing and mouth movement on BOTH the Instant and Finetuned avatars.

This is only the beginning… watch this tech closely in the coming months!

Translations & Dubbing

There are two ways of generating translations in HeyGen. One is to input translated text into the video avatar producer, select a multilingual voice from your ElevenLabs API, and create a clean video avatar from there.

The other method lets you upload any video clip at least 30 seconds long with the subject facing the camera; in a few minutes it automatically generates a new video with a clone of the actor’s voice, lip-synced to the translated dialogue. Here’s their rundown on the process in video form from the HeyGen website:

I’ve tested several video clips and the results have been amazing! Check out the intro video at the top of the article to see more examples I’ve created.

Here’s an example clip that I created from a scene from Pulp Fiction with Christopher Walken and translated into Spanish and French. You can see where this could be really helpful for video dubbing and regionalizations in the future.

Pros & Cons

While I have been a major fanboy the past month or so over these new features and capabilities, I’d be remiss to not point out some things that I hope get resolved or updated in future versions of the HeyGen software tools – and pay structures.

The tools are evolving quickly – to the point that I think most of this review will be obsolete by year-end. And with that, HeyGen may well be positioned to be bought up by a bigger brand, or another round of financing may encourage the developers to make a leap toward world domination. (Only slightly kidding.) 😉

I would like to see the ability to control the Instant Avatars more with gestures, facial expressions, etc. When the voices get more energetic, the faces should reflect that as well. Maybe just an “exaggeration/enhancement” slider or something.

The talking photos could use more control as well – like the way the puppet tool works in After Effects, where you can define the points that move or at least define the boundaries of the head/hair so the whole head moves – not just the face.

AI Tools: HeyGen AI Video Avatars & Translations 172

And pricing seems to be all over the place currently – though that might simply reflect how quickly the product offerings are changing. For instance, the $99/yr voice clone feels sub-par compared to what you can generate in ElevenLabs (which is why I’m thankful for the ElevenLabs API integration – it produces the best of both worlds in one easy step). The monthly fee for the base service is fair, especially since 3 Instant Avatars are included with the $59 Creator package. The “Finetuned” option, however, is an additional $59/mo for EACH AVATAR you upgrade. That means if you upgrade all three Instant Avatars you create, that’s an additional $177/mo (3 × $59) just to continue using them. I suppose you can cancel the upgrade plan for each avatar once you no longer need it, but I’m not really seeing that much value in the small difference that “Finetuning” provides at this point for most customers – though professionals will justify the additional cost for a better result.

 

]]>
https://www.provideocoalition.com/ai-tools-heygen-software-review/feed/ 1
Hands-on with the GoPro HERO12 Black https://www.provideocoalition.com/hands-on-gopro-hero12-black/ https://www.provideocoalition.com/hands-on-gopro-hero12-black/#comments Sat, 09 Sep 2023 22:19:43 +0000 https://www.provideocoalition.com/?p=270964 Read More... from Hands-on with the GoPro HERO12 Black

]]>
It’s that time of year again – GoPro has been releasing an updated camera model annually for the last few years, almost like clockwork, and this year it’s the GoPro HERO12 Black. So what have they been up to since the launch of the HERO11 Black a year ago? A lot. But since I only had the camera in hand for a few days before carving out time to put it to use, I’m just going to give an overview in this review; it will take additional time to dig down deep for a comparison review, which I hope to do next month (still in plenty of time for those on the fence about holiday shopping decisions).

At first glance out of the box, you’ll notice the rubberized casing on the HERO12 Black body has a light blue speckled finish – which I thought at first was manufacturing dust (lol) but after some use, it does help hide a lot of the surface dirt and scuffs that accumulate over time. Plus it sets it apart from the previous HERO cameras right away. Otherwise, the form factor is the same and it fits all the same GoPro accessories and mods that work with the HERO11 Black – including the Enduro batteries.

Hands-on with the GoPro HERO12 Black 198

So, let’s get right into a quick look at the key features/specs. The GoPro HERO12 Black has several new features, including:

  • HDR: High dynamic range video and photos
  • Improved video stabilization: HyperSmooth 6.0 with AutoBoost automatically enhances video stabilization
  • Wireless audio: Supports Apple AirPods and other Bluetooth headphones
  • Doubled battery life: In certain modes, the battery life is doubled
  • Versatile aspect ratio: The 1/1.9” sensor has an 8:7 aspect ratio that can be cropped into different aspect ratios
  • 360 degree Horizon Lock
  • Multiple Bluetooth connections
  • Timecode Sync
  • Mounting thread: The 1/4-20 mounting thread allows the camera to be attached to most camera tripods without an adapter
The Hero 12 Black also has 5.3K and 4K HDR video. It has the same maximum resolution as the Hero 11 Black at 5.3K at 60 frames per second. Launched alongside it is the Max Lens Mod 2.0 for the Hero 12, with an ultra-wide 177° FOV.

Hands-on with the GoPro HERO12 Black 199

First off, Hypersmooth 6.0 with AutoBoost really lives up to the hype. It’s not just a marketing gimmick, it makes even the novice user look pro right out of the box. AutoBoost allows the sensor to capture the maximum amount of stabilized image data when there’s relatively smooth terrain and motion, so you have less cropping of your image area until it’s needed to really stabilize radical motion. There’s really not much more to say other than look at my hands-on videos below and see for yourself!

Hands-on with the GoPro HERO12 Black 200

The Bluetooth Wireless Audio Support feature was a non-starter for me, as it only seems to work with Apple AirPods and some Bluetooth earbuds. I couldn’t get a connection with my third-party earbuds or any other wireless devices to date, but I’ll dig into that deeper in my next article. From the examples I’ve heard from other YouTube reviewers, it’s more akin to a cell phone audio connection anyway, so its practicality would be limited to voice commands to the GoPro and possibly syncing to another audio recorder for production.

Hands-on with the GoPro HERO12 Black 201

Now THIS part was exciting for me personally and professionally, as I tend to use GoPros as auxiliary cameras on multicam shoots or put them in places where I run them for a long time, instead of the typical shots of five minutes or so that the average action cam user may capture.

ALSO – there’s less chance of the overheating and shutdowns I often complained about in the previous few versions of the HERO cams. GoPro removed the built-in GPS function, and I believe that contributes to the longer runtimes and cooler operation. I never once used the GPS tagging features, but I’m sure some customers rely on them, so perhaps future versions will make it a selectable option that can be turned on/off at the cost of shorter runtimes.

I can’t wait to fully play with the night modes more before the cloudier skies of Fall come upon us, and I will be sharing those in my in-depth follow-up review shortly. They appear to be very accessible and straightforward to use.

Hands-on with the GoPro HERO12 Black 202

Max Lens Mod 2.0

The Max Lens Mod 2.0 is a physical lens that you swap with the original HERO12 lens; the camera automatically detects it and switches to Max mode. The default settings I used in my first tests were 4K 16:9 30p with MSV (Max SuperView) and Max HyperSmooth. You can push it up to 60p in 4K, in a vertical 9:16 ratio, or 4:3 with Max HyperView. If you shoot at 1080 you can go to 120p, but you can no longer get the widest Max HyperView.

Hands-on with the GoPro HERO12 Black 203

Hands-on with the GoPro HERO12 Black 204

Here’s how the three different view crop settings in-camera compare.

Hands-on with the GoPro HERO12 Black 205

 

Professional Production Upgrades

Many of the features and improvements in the HERO12 Black – HyperSmooth 6.0 stabilization, longer battery runtimes, multiple Bluetooth connections and the versatile 8:7 sensor – enhance its overall appeal to professional users, but I feel the following additions are what push it over the edge in real-world applications.

Timecode Sync

The GoPro Hero 12 Black has a feature called timecode sync that allows you to wirelessly sync an unlimited number of Hero 12 Black cameras. This makes it easier to edit footage from multiple angles and ensure they match up. Simply show the QR code from the Quik app to each HERO12. The time and date will sync between all the HERO12s you’re shooting with. Timecode sync works with Final Cut Pro, Adobe Premiere Pro, and other leading editing apps.

Hands-on with the GoPro HERO12 Black 206

Return of Screen Monitoring while Recording in the Quik App

This may seem like a minor feature, but for those of us who want to monitor the camera’s POV while recording (within Bluetooth range) to make sure we’re lining up our shots and capturing what we hoped, it’s invaluable. And we’re glad they’ve brought it back to the Quik app!

Hands-on with the GoPro HERO12 Black 207

Built-in 1/4-20 Threaded Mount

This may seem like a ho-hum option for some, but most of us who use our GoPros in production environments are always scrambling to find those damned adapters to get our cams mounted – at the cost of additional height and more chances for slippage or spin-offs under stress. I’m really pleased to finally have this option over resorting to a gaff tape workaround! 😉

Hands-on with the GoPro HERO12 Black 208

GP LOG encoding and HDR 10bit video

I’m excited to test this feature in my next in-depth review to see how closely the footage can be matched with other professional cameras in post – which makes the HERO12 Black a more accessible option for indie filmmakers and professional videographers. GoPro provides their basic LUT with the camera (GPLOG_Auto_WB.cube).
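
If you want a quick look at GP-Log clips through that LUT before you’re in your NLE, ffmpeg’s lut3d filter can apply a .cube file to a preview render. A rough sketch below, wrapped in Python for convenience – the clip name is a placeholder, and for finishing work you’d apply the LUT on a color-managed timeline instead:

```python
# Rough preview sketch: bake GoPro's supplied LUT into a Rec.709 proxy with
# ffmpeg's lut3d filter. The clip filename is a placeholder; this is for quick
# review only, not for final color work.
import subprocess

subprocess.run([
    "ffmpeg",
    "-i", "GX010001.mp4",                # placeholder GP-Log clip name
    "-vf", "lut3d=GPLOG_Auto_WB.cube",   # LUT shipped with the HERO12 Black
    "-c:a", "copy",                      # leave audio untouched
    "preview_rec709.mp4",
], check=True)
```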

Hands-on with the GoPro HERO12 Black 209

But here’s the exciting part – seeing all of these features together with an action-packed GoPro HERO12 Black launch video that highlights each in detail:

You really need the GoPro Quik app to set up and update your HERO12 Black, and quite frankly, I don’t use 90% of the app’s capabilities – yet. But with on-screen monitoring while recording back, I’m finding that I use more features as I engage with the device.

Hands-on with the GoPro HERO12 Black 210

You can read up more on the GoPro website to get further details and examples. For the detail-oriented folks who just want to see the specs, I’m putting the 10-panel comparison charts down below my hands-on videos for more information on what has changed from the HERO11 Black – so you don’t have to keep scrolling forever. 😉

Hands-On Testing

The reviewer’s kit I received from GoPro included the Max Lens Mod 2.0, and all my tests this past weekend were made in the simplest, easy settings mode right out of the box. I didn’t alter anything (yet) beyond selecting a couple of defaults, to see how it performed for the average lifestyle video user. I also only used the raw videos right out of the camera, with no color correction or stabilization, so what you see is what you get – and with the default Protune settings in HDR, it’s really impressive. So is HyperSmooth 6.0, as you’ll see in my test videos below.

For this first video compilation I just took the HERO12 Black out of the box and charged it up, set it up with the updated Quik app on my iPhone, and started my Saturday of the Labor Day weekend, doing tasks and chores around our small farm here in the Northern California Sierra Foothills. So much of our lives here are built around our trees, plants, our horse Mysty and of course Halona the PD (Production Dog). I literally clipped the GoPro to anything and everything as I went about my day – I shot far more than you’ll see here, so these are only some of the highlights. This was all shot in 5.3K/60p, but I exported the edited version to 4K/60p for YouTube uploading. No other processing of the raw footage was applied.

I took a bit of time out for some RC truck fun with my Traxxas test vehicle to try the Max Lens Mod and put the HyperSmooth 6.0 stabilization to the test. You can see it really held the horizon leveling even with jarring movements and jumps over rough terrain. Shot at 4K/30p in HDR.

It’s been 12 years since our first GoPro dog cam video with the HERO2 featuring our beloved Halona the PD (Production Dog), who just turned 15, so it seemed only fitting to let her run around the farm with the HERO12 Black and the Max Lens Mod to get a dog’s POV of the world. What impressed me most, beyond the stabilization from a GoPro merely clipped to a dog’s collar, was how amazing the footage looks straight out of the camera! Shot in 4K/30p in HDR, through transitions from exterior to interior, setting sunlight and shadows, the HERO12 Black adjusted beautifully.

 

Comparison Charts between HERO12 Black and HERO11 Black

Hands-on with the GoPro HERO12 Black 211 Hands-on with the GoPro HERO12 Black 212

Hands-on with the GoPro HERO12 Black 213

Hands-on with the GoPro HERO12 Black 214 Hands-on with the GoPro HERO12 Black 215 Hands-on with the GoPro HERO12 Black 216 Hands-on with the GoPro HERO12 Black 217 Hands-on with the GoPro HERO12 Black 218 Hands-on with the GoPro HERO12 Black 219
Hands-on with the GoPro HERO12 Black 220

Coming in the near future:

To help streamline your professional editing process, GoPro is finally developing a great way to ingest, archive and do basic editing of your footage from the cloud to your desktop (requires GoPro.com subscription). This is for the rest of us who don’t necessarily like to edit short clips on our iPhones and resort to pulling the SD card after every shoot just to wrangle our data. This could be a real game changer for professional productions, and I’ll be checking it out once the Windows version is available next spring.

Hands-on with the GoPro HERO12 Black 221

Obviously, I’ve only scratched the surface with the capabilities of the HERO12 Black in my hands, so look for a subsequent in-depth review shortly. In the meantime, check out the GoPro subscription – where content and control of your media is centralized with protected cloud storage. I hope to cover more about this workflow as well.

Hands-on with the GoPro HERO12 Black 222

]]>
https://www.provideocoalition.com/hands-on-gopro-hero12-black/feed/ 1
AI Tools Part 4: The Latest for Video, Film Production, 3D, Audio and more… https://www.provideocoalition.com/ai-tools-part-4/ https://www.provideocoalition.com/ai-tools-part-4/#comments Mon, 28 Aug 2023 15:40:20 +0000 https://www.provideocoalition.com/?p=269852 Read More... from AI Tools Part 4: The Latest for Video, Film Production, 3D, Audio and more…

]]>
Here we are at the end of August 2023. So much is happening so fast in Generative AI development that I obviously can’t keep up with everything, but I will do my best to give you the latest updates on the most impressive AI Tools & techniques I’ve had a chance to take a good look at. Starting with this fun example made with ElevenLabs AI TTS and a HeyGen AI animated avatar template:

While the animated avatar is “fun,” it obviously still has a ways to go to serve any real application for video or film purposes – but see the first segment below for a peek into the near future of just how realistic they’re going to be!

But let’s focus on that cloned voice recording for a minute! I’ll go into more detail about cloning in the AI Generative Audio segment in this article, but just to show you that this was completely AI generated, here’s a screenshot of that script:

AI Tools Part 4: The Latest for Video, Film Production, 3D, Audio and more... 231

Exciting times for independent media creators, trainers and marketers to be empowered with these tools to generate compelling multimedia content!

I’m not going to add a lot of hypothetical fluff and opinion this time – you can get plenty of that on social media and continue the ethics debates, etc. on your own (including who’s on strike now and how AI is going to take away jobs, etc.). I’m just here to show you some cool AI Tools and their capabilities for prosumer/content creators.

So in case this is your first time reading one of my AI Tools articles, go back in time and read these as well and see just how far we’ve come in only 8 months this year!

AI Tools Part 1: Why We Need Them

AI Tools Part 2: A Deeper Dive

AI Tools Part 3: The Current State of Generative AI Tools

And ALWAYS keep an eye on the UPDATED AI TOOLS: THE LIST YOU NEED NOW!

AI Tools Categories:

Generative AI Text to Video

In this segment we’re going to look at one of the leading AI tools that can produce video or animated content simply from text. Yes, there are several that claim to be the best (just ask them), but from what I’ve seen to date, one stands out above them all. And the best part is that if you have some editing/mograph production skills, you can do much more than just use the standard templates, as I outline below. And it’s only going to get much better very soon.


HeyGen (formerly Movio.ai)
https://app.heygen.com/home

I’ve seen several AI text to video generators over the past several months and frankly, most of them seem rather silly, robotic and downright creepy – straight out of the Uncanny Valley. And granted, some of the basic default avatars and templates in HeyGen can take on that same look and feel as well. For now, that is.

I tried a couple of their templates to test the capabilities and results, and what I found might be okay for some generic marketing and messaging on a customer-facing website for a small business or service. That definitely serves the needs of some businesses on a budget. But I found their TTS engine wasn’t all that great, and the robotic look of the avatar is only made more distracting by the robotic voice. I then ran the same script through ElevenLabs AI to produce a much clearer, more human-like voice, applied it to the template as well, and the results are more acceptable.

AI Tools Part 4: The Latest for Video, Film Production, 3D, Audio and more... 232

In this following example, you’ll hear the HeyGen voice on the first pass, then the ElevenLabs synthesized voice on the second pass.

Of course, that was the same process I used in my opening example video at the top of the article, but I used my own cloned voice in ElevenLabs to produce the audio.

To take this production process a step further, I created a short demo for our marketing team at my day gig (a biotech company) and included one extra step at the beginning – using ChatGPT to give me a 30-second VO script on the topic of genome sequencing written for an academic audience. I then ran through the process of rendering the avatar on a green background and editing it with text/graphics in Premiere.

AI Tools Part 4: The Latest for Video, Film Production, 3D, Audio and more... 233

The result was encouraging for potential use on social media, getting quick, informative content out to potential customers and educators. The total time from concept to completion was just a few hours – and it could be much faster if you used templates in Premiere for commonly-used graphics, music, etc.

But just wait until you see what’s in development now…

The next wave of AI generated short videos is going to be much harder to tell they aren’t real. Check out this teaser from the co-founder and CEO of HeyGen, Joshua Xu, as he demos this technology in this quick video:

So yes, I’ve applied for the beta and will be sharing my results in another article soon! In the meantime, I was sent this auto-generated, personalized welcome video from the HeyGen team – which serves as yet another great use for these videos. This is starting to get really interesting!

Stay tuned…

Generative AI Outpainting

Since Adobe launched their Photoshop Beta with Firefly AI, there have been a lot of cool experimental projects shared through social media – and some tips are pretty useful, as I’ve shown in previous AI Tools articles – but we’ll look at one particularly useful workflow here. In addition, Midjourney AI has been adding new features to its v5.2 such as panning, zoom out and now vary (region).

AI Tools Part 4: The Latest for Video, Film Production, 3D, Audio and more... 234
Expanding the scene from Lawrence of Arabia with Generative AI

Adobe Photoshop

Adobe Photoshop Beta’s AI Generative Fill allows you to extend backgrounds for upscaling, reformatting and set extensions in production for locked-off shots. (You can also use it for pans and dolly shots if you can motion track the “frame” to the footage, but avoid anything with motion out at the edges or parallax in the shot.)

For my examples here, I simply grabbed a single frame from a piece of footage, opened it in Photoshop Beta, and set the canvas size larger than the original. I then made a selection just inside the live image area and inverted it, so Generative Fill could paint in the outside areas.
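
If you’re prepping many frames for the same treatment, the canvas-extension and selection geometry can be set up ahead of time outside Photoshop. Here’s a minimal sketch using Pillow, purely to illustrate the padded canvas and the inverted border mask – the pad size, inset and filenames are my own placeholder values, not part of Adobe’s workflow:

```python
# A minimal sketch (not Adobe's API): place a grabbed frame on a larger canvas
# and build a border mask that is white where new pixels should be generated
# and black over the original image area. Sizes and filenames are placeholders.
from PIL import Image, ImageDraw

PAD = 480                                  # new canvas area on each side, in pixels
frame = Image.open("frame_0001.png")
w, h = frame.size

canvas = Image.new("RGB", (w + 2 * PAD, h + 2 * PAD))
canvas.paste(frame, (PAD, PAD))

mask = Image.new("L", canvas.size, 255)    # 255 = region to fill/generate
draw = ImageDraw.Draw(mask)
# The small inset mirrors making the selection "just inside" the live image
# area, so the generated fill blends over the original edge pixels.
draw.rectangle([PAD + 8, PAD + 8, PAD + w - 8, PAD + h - 8], fill=0)

canvas.save("frame_0001_padded.png")
mask.save("frame_0001_mask.png")
```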

AI Tools Part 4: The Latest for Video, Film Production, 3D, Audio and more... 235

Photoshop Beta typically produces three variations of the AI-generated content to choose from, so make sure you look at each one to find the one you like best. In this example, I liked the hill best of the choices offered, but wanted to get rid of the two trees as they distract from the original shot. That’s easy with Generative Fill “Inpainting.”

AI Tools Part 4: The Latest for Video, Film Production, 3D, Audio and more... 236

Just like selecting areas around an image frame to Outpaint extensions, you can select objects or areas within the newly-generated image to remove them as well. Just select with a marquee tool and Generate.

AI Tools Part 4: The Latest for Video, Film Production, 3D, Audio and more... 237

The resulting image looks complete. Simply save the generated “frame” as a PNG file and use it in your editor to expand the shot.

AI Tools Part 4: The Latest for Video, Film Production, 3D, Audio and more... 238

I really wish I’d had this technology the past couple of years while working on a feature film for which I needed to create extensions for parts of certain shots. I can only imagine it’s going to get better in time – especially once Generative Fill comes to After Effects.

Here are a few examples of famous shots from various movies you may recognize, showing the results of generated expansion in 2K.
(For demonstration purposes only. All rights to the film examples are property of the studios that hold rights to them.)


Midjourney (Zoom-out)

In case you missed my full article on this new feature from Midjourney AI, go check it out now and read more on how I created this animation using 20 layers of rendered “outpainted” Zoom-out frames, animated in After Effects.

Two new features have also been added since: Pan (extend up/down/left/right) and Vary (Region) for inpainting rendered elements. I’ll cover those in more detail in an upcoming article.

 

Generative AI 3D

You’re already familiar with traditional 3D modeling, animation and rendering – and we have a great tool to share for that. But we’re also talking about NeRFs here.

What is a NeRF and how does it work?

A neural radiance field (NeRF) is a fully-connected neural network that can generate novel views of complex 3D scenes, based on a partial set of 2D images. It is trained to use a rendering loss to reproduce input views of a scene. It works by taking input images representing a scene and interpolating between them to render one complete scene. NeRF is a highly effective way to generate images for synthetic data. (Excerpted from the Datagen website.)

A NeRF network is trained to map directly from viewing direction and spatial location (5D input) to opacity and color (4D output), using volume rendering to render new views. NeRF is a computationally-intensive algorithm, and processing of complex scenes can take hours or days. However, new algorithms are available that dramatically improve performance.
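
To make that “5D in, 4D out” idea concrete, here’s a toy sketch of the function a NeRF approximates and the simplified volume-rendering step that turns those samples into a pixel – illustration only, not a usable implementation:

```python
# Toy sketch: a NeRF learns a function from a 5D input (3D position + 2D view
# direction) to a 4D output (RGB color + volume density). A real NeRF fits this
# with an MLP plus positional encoding; this stub just shows the signature.
import numpy as np

def nerf_field(x, y, z, theta, phi):
    """Placeholder field: returns (r, g, b, sigma) for one sample point."""
    r, g, b = 0.5, 0.5, 0.5   # view-dependent color at this point
    sigma = 0.0               # volume density (how opaque this point is)
    return r, g, b, sigma

def render_ray(samples, deltas):
    """Composite (r, g, b, sigma) samples front-to-back along one camera ray."""
    color = np.zeros(3)
    transmittance = 1.0
    for (r, g, b, sigma), delta in zip(samples, deltas):
        alpha = 1.0 - np.exp(-sigma * delta)      # opacity of this segment
        color += transmittance * alpha * np.array([r, g, b])
        transmittance *= 1.0 - alpha              # light left for later samples
    return color
```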

This is a great video that explains it and shows some good examples of how this tech is progressing:


Luma AI
https://lumalabs.ai/

Luma AI uses your camera to capture imagery from 360 degree angles to generate a NeRF render. Using your iPhone, you can download the iOS app and try it out for yourself.

I tested the app while waiting for the water to boil for my morning cup of coffee in our kitchen and was pleasantly surprised at how quick and easy it was from just an iPhone! Here’s a clip of the different views from the render:

View the NeRF render here and click on the various modes in the bottom right of the screen: https://lumalabs.ai/capture/96368DE7-DF4D-4B4B-87EB-5B85D5BDEA37?mode=lf

It doesn’t always work out as planned, depending on the scale of your object, the environment you’re capturing in, and things like harsh lighting, reflections, shadows and transparency. Working on a level surface is also more advantageous than my attempt at capturing my car on our steep driveway in front of the house on our farm.

View the NeRF render yourself here to see what I mean – the car actually falls apart and there’s a section where it jumped in position during the mesh generation: https://lumalabs.ai/capture/8AE1BF7D-5368-427F-8CC4-D02B1B887D31?mode=lf

For some really nice results from power users, check out this link to Luma’s Featured Captures page.


Flythroughs (LumaLabs)
https://lumalabs.ai/flythroughs

To simplify generating a NeRF flythrough video, LumaLabs has created a new dedicated app that automates the process. It just came out and I’ve not had a chance to really test it yet, but you can learn more about the technology and see several results on their website. It’s pretty cool, and I can only imagine how great it’s going to be in the near future!


Video to 3D API (LumaLabs)
https://lumalabs.ai/luma-api

Luma’s NeRF and meshing models are now available via their API, giving developers access to the world’s best 3D modeling and reconstruction capabilities – at a dollar per scene or object. The API expects video walkthroughs of objects or scenes, shot from the outside looking in, from 2-3 levels. The output is an interactive 3D scene that can be embedded directly, coarse textured models to build interactions on in traditional 3D pipelines, and pre-rendered 360° images and videos.

For games, ecommerce and VFX – check out the use case examples here.


Spline AI
https://spline.design/ai

So what makes this AI-based online 3D tool so different from any other 3D modeler/stager/designer/animator app? Well, first of all, you don’t need any prior 3D experience – anyone can start creating with some very simple prompts/steps. Of course, the more experience you have with other tools – even just Adobe Illustrator or Photoshop – the more intuitive working with it will be.

Oh, and not to mention – it’s FREE and works from any web browser!

You’re not going to be making complex 3D models and textures or anything cinematic with this tool, but for fun, quick and easy animations and interactive scenes and games, literally anyone can create in 3D with this AI tool.

This is a great overview of how Spline works and how you can get up to speed quickly with all of its capabilities, by the folks at School of Motion:

Be sure to check out all the tutorials on their YouTube channel as well – with step-by-step instructions for just about anything you can imagine wanting to do.

Here’s a great tutorial for getting started with the Spline tool – so intuitive and easy to use:

 

Generative AI Voiceover (TTS)

There are several AI TTS (Text To Speech) generators on the market, and some are built into other video and animation tools, but one stands out above all the rest at the moment – which is why I’ve focused only on ElevenLabs AI.


ElevenLabs Multilingual v2

It’s amazing how fast this technology is advancing. In less than a year we’ve gone from the early demos I shared in my AI Tools Part 1 article in January 2023 to having so much more control over custom voices, cloning, accents and now even multiple languages!

With the announcement this week that Multilingual v2 is out of beta, the supported languages now include Bulgarian, Classical Arabic, Chinese, Croatian, Czech, Danish, Dutch, English, Filipino, Finnish, French, German, Greek, Hindi, Indonesian, Japanese, Italian, Korean, Malay, Polish, Portuguese, Romanian, Slovak, Spanish, Swedish, Tamil, Turkish & Ukrainian (with many more in development), joining the previously available languages: English, Polish, German, Spanish, French, Italian, Hindi and Portuguese. Check out this demo of voices/languages:

Not only does the Multilingual v2 model provide different language voices, but it also makes your English voices sound much more realistic with emphasis and emotion that is a lot less robotic.

As I’ve mentioned in my previous AI Tools articles, we’re using ElevenLabs AI exclusively for all of our marketing How-to videos in my day gig at the biotech company. What I’m most impressed with is the correct pronunciation of technical and scientific terminology and even most acronyms. I’ve rarely had to phonetically spell out key words or phrases, though changing the synthesized voice parameters can change a lot of inflection and tone from casual to formal. But retakes/edits are a breeze when editing! Besides, some of the AI voices sound more human than some scientists anyway (j/k) 😉

Here’s an example of a recent video published using AI for the VO.

 

Cloning for Possible use with ADR?

As you can hear in my opening demo video (well, those that know me and my voice that is), the Cloning feature in ElevenLabs is pretty amazing. Even dangerously so if used without permission. That’s why I’ve opted to NOT include an example from another source in this article, but only to point out the accuracy of the tone and natural phrasing it produces on cloned voices.

For film and video productions, this means you can clone the actor’s voice (with their permission of course) and do dialogue replacement for on-air word censoring, line replacements and even use the actor’s own voice to produce translations for dubbing!

I recorded a series of statements to train the AI and selected a few for this next video. Here is a side-by-side comparison of my actual recorded voice and the cloned voice. Can you guess which one is the recorded voice and which one is AI?

You can probably tell that the takes with slightly sloppy, lazy pronunciation were my original recorded voice – which makes me think this would be a better way to record my voice for tutorials and VO projects, so I always have a clear, understandable voice that’s consistent, regardless of my environment and how I’m feeling on any given day.

So to test the different languages, I used Google Translate to translate my short example script from English to various languages that are supported by ElevenLabs, plugged that into Multilingual v2 and was able to give myself several translations using my own cloned voice. So much potential for this AI technology – I sure wish I could actually speak these languages so fluently!

Again – this would be great for localization efforts when you need to translate your training and tutorial videos and still stay in your own voice!

JOIN ME at the AI Creative Summit – Sept 14-15

I’ll be presenting virtually at the AI Creative Summit on September 14-15, produced by FMC in partnership with NABshow NY.

I’ll be going into more detail on some of these technologies and workflows in my virtual sessions along with a stellar cast of my colleagues who are boldly tackling the world of Generative AI production!

AI Tools Part 4: The Latest for Video, Film Production, 3D, Audio and more... 239

Register NOW! Hope to see you there!

]]>
https://www.provideocoalition.com/ai-tools-part-4/feed/ 2
AI Tools: Animations with Midjourney & After Effects https://www.provideocoalition.com/ai-tools-animation-with-midjourney-after-effects/ https://www.provideocoalition.com/ai-tools-animation-with-midjourney-after-effects/#respond Mon, 31 Jul 2023 21:19:42 +0000 https://www.provideocoalition.com/?p=269415 Read More... from AI Tools: Animations with Midjourney & After Effects

]]>
You’ve most probably seen a lot of AI-generated animations already, made with tools like Runway Gen-2 and Kaiber that either take still images and animate a few seconds of motion, or generate text-to-video animations directly – often looking somewhat like a psilocybin-induced trip. While they’re interesting and often artistic as a genre unto themselves, I’ve found little practical use for the technology to date – especially for anything even remotely realistic or smooth in its delivery.

AI Tools: Animations with Midjourney & After Effects 249

While I’m not looking for extreme realism like I might in VFX work, I do want my fantasy creations to feel smooth and have quality motion and engaging elements that aren’t distracting to the overall work.

That’s why AI-generated content isn’t always ready straight out of the box. You often need to do further work on the elements of a rendered image in Photoshop – or, if animating objects, After Effects. I do a lot of work in After Effects almost daily, and 95% of it is working with 2D content and objects that simulate 3D environments and effects.

That’s where stumbling across the Zoom Out out-painting feature in Midjourney v5.2 comes in for some specific compositions. Midjourney is just the content generator, while After Effects is the animator.

Want to know how to use this new feature in Midjourney on Discord? This video tutorial for Midjourney Zoom-Out from All About AI’s YouTube channel walks you through the steps to produce them:

For this specific type of animation I’m looking to create, After Effects is necessary to “zoom” through multiple rendered layers from Midjourney to reveal a long track motion.

Initial Tests

So when it was announced that Midjourney v5.2 allowed for Zoom-out and Pan out-rendering capabilities, that triggered some fun experimentation in my quest for something unique.

I started with a bizarre concept that surprised me when it first rendered, as it wasn’t anything I had prompted Midjourney for, but it gave a delightful result nonetheless. It was the first of July and the beginning of a 4-day weekend, and I often do a fun take on the ritual of saying “rabbit rabbit” on social media on the first of the month, with an image from Midjourney. In this instance, I asked for “two rabbits eating hotdogs,” thinking I’d be able to add some fireworks or something colorful in subsequent renditions. The resulting image you see below in the first frame is one of the results I got, and I was delighted at the choice of composition (even though it looks more like they’re eating an alien or squid). 😉

AI Tools: Animations with Midjourney & After Effects 250

When Midjourney v5.2 was announced days later, I went back into Discord and looked through my completed images in search of something to try the new Zoom-Out feature on. The next two renders were sequential and surprised me even more! First, the Zoom-out 1.5x gave me a nice outpainted scene around my initial feasting-rabbits render. I then went 2x more without changing the prompt at all – and got two more rabbits. I kept going 2x, 2x, 2x, 2x a bunch of times (13 to be exact), and it just created this long street scene lined with rabbits. They were really multiplying, and they varied in their characteristics, clothing and surroundings. They went from eating to mostly drinking, but the variety of clothing and facial expressions was delightful!

After collecting all these renders and stacking them on top of each other in sequence in After Effects, I knew I needed to align the zooming so that it always used the best-quality “center focal area” of each layer, zooming between 400% and 50% before fading out and revealing the layer below, aligned to take over the zoom. I used a feathered mask about 100 pixels inside the edge of each frame to eliminate the artifacts and differences in edge pixel data from each render.

AI Tools: Animations with Midjourney & After Effects 251

I repeated the process all the way through 14 different layers just to see if it would work…

AI Tools: Animations with Midjourney & After Effects 252

…and it kinda did!
My first test render worked okay in theory, but zooming linearly between 400% and 50% makes the apparent speed ramp in waves as each subsequent frame’s zoom cycle begins – it slows down in the last 30% of the scaling path, then speeds up again over the next 30-50% of scaling motion. Here’s the first test result:

So in essence, each layer on the timeline looked like this:

AI Tools: Animations with Midjourney & After Effects 253

What I needed was a way to normalize this motion to achieve a steady speed through the range of scale motion. After a little snooping, I discovered a feature in After Effects that I’d never used before: “Exponential Scale,” found under the Animation drop-down menu:

AI Tools: Animations with Midjourney & After Effects 254

So by selecting the first and last scale keyframes in the timeline and then applying this option, the speed ramping of the Exponential Scale is normalized straight through.

AI Tools: Animations with Midjourney & After Effects 255
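
For the curious, the math behind it is simple: a zoom that feels constant has to change the scale by the same ratio every frame, not by the same number of percentage points. A quick sketch of the difference over the 400%-to-50% range used here (the frame count is just an example value):

```python
# Sketch of why linear scale keyframes "wave" while Exponential Scale feels
# steady: a constant zoom multiplies the scale by the same ratio each frame.
s0, s1, frames = 400.0, 50.0, 96    # example: 400% down to 50% over 96 frames

linear = [s0 + (s1 - s0) * f / frames for f in range(frames + 1)]
exponential = [s0 * (s1 / s0) ** (f / frames) for f in range(frames + 1)]

# Frame-to-frame ratio: constant (~0.979) for the exponential version, but
# drifting for the linear version - that drift is the speed ramping visible
# in the first test render.
print(exponential[1] / exponential[0], exponential[-1] / exponential[-2])
print(linear[1] / linear[0], linear[-1] / linear[-2])
```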

After applying that to all the layers and making a few tweaks and timing adjustments on some of them, I rendered it again with Exponential Scale:

So while that was a satisfying discovery and a good practice and proof of concept piece, I set out to make something very deliberate and much more immersive.

The Dimensions of Nature Animation Project

What started out as an experiment ended up as a collaborative project with my wife, Ellen Johnson, to make a short nature-inspired music video. I generated 20 different rendered frames (10 different “globes”) for a total zoom-out factor of 1,048,576x. Short of using 3D animation software, there’s really no way to create this with optics alone – especially at this resolution.

I animated the layers for this project (rendered out of Midjourney v5.2) so the piece zooms out from a snowglobe on a table through different miniature worlds of wonder, each inside another, eventually resting on a large glass jar on a carved stump. Ellen provided all the music scoring and sound design, with SFX that travel through these various “worlds,” while I incorporated animated segments for each environment to bring them to life a bit.

AI Tools: Animations with Midjourney & After Effects 256

In essence, we’re sort of creating an animated Russian Doll effect where you see one globe/jar environment inside another, inside another, etc.

AI Tools: Animations with Midjourney & After Effects 257
Feathered Mattes ensure the highest quality images are always centered as we zoom out (downscaling)

The base zoom-out animation followed the same process as the rabbit animation earlier; stacking layers, masking, scaling, Exponential Scale applied to each layer, etc.

AI Tools: Animations with Midjourney & After Effects 258

Once all 20 rendered layers were aligned and set to complete in approximately 3:40 (the proposed length of the music soundtrack), we determined what additional animation segments would make sense traveling through the various “worlds” – which transitioned through different types of weather and flora/fauna. This laid the groundwork for the music and sound design, as well as the detailed animations that got composited later.

Here are the raw layers zoomed out and sped up 3x, so you can see how smooth the scaling effect is before adding any other elements:

In this case, the Generative AI imagery was not only the inspiration for this animation, but also the base content that was animated, along with some other 2D elements such as particle generations, masked effects on motion layers and some green screen video objects for various insects, etc. from Adobe Stock. Over 100 layers in this project alone – before moving over to Adobe Premiere to edit in the recorded/mixed music track from Logic Pro, with another 12+ SFX tracks.

AI Tools: Animations with Midjourney & After Effects 259
Green Screen animated insects (with reflections/shadows)
AI Tools: Animations with Midjourney & After Effects 260
Wrapped water droplet layers and particles applied
AI Tools: Animations with Midjourney & After Effects 261
Lightning bugs animated over time

The completed music video/animation project is on YouTube – it’s best to watch it full-screen on YouTube directly rather than embedded, so you can see the details. Earbuds or headphones are also recommended if you don’t have a quality sound system.

Keep in mind, this is only 1080×1080 – the native dimensions at which Midjourney rendered the frames used – and it isn’t supposed to be truly realistic like a 3D render, but rather a fantasy exploration of imagery and sound. Enjoy.

 

]]>
https://www.provideocoalition.com/ai-tools-animation-with-midjourney-after-effects/feed/ 0