wy168 posted on 2022-8-28 06:49:41

AI Brings Still Images to Life: Trump and the Mona Lisa Sing a Heartfelt Duet of "Unravel"

<div id="193a2427-46db-469b-8100-7d8021480dff" style="font-size:18px;margin:20px 0px;text-align:left;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-qvj2lq49k0/5accc511114b4c6bbfd7c32011d10d96~noop.image?_iz=58558&amp;from=article.pc_detail&amp;x-expires=1660566100&amp;x-signature=Bu73%2F1KEMwMvFYJ7Al1xGKhkTWg%3D" style="width:100%;"></div><h1 id="384d64d7-922d-43a6-922d-87340e6dd918" style="font-size:20px;margin:20px 0px;font-weight:700;">1 Preface</h1><p id="5d82130c-bc13-4a3c-a0a1-8b60c0d3c1c6" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;">How do you make a single still image move?</p><p id="734f6e8f-214a-4f32-b3e2-0185aae41a3f" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;">The First Order Motion Model, a DeepFake-style technique, can animate almost anything.</p><p id="df9c04b5-b3e7-4e54-8ebc-bf41d6b9a1fc" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;">What happens if we use this technique to make pictures of Trump and the Mona Lisa sing "Unravel" together?</p><p id="9d4319cc-539e-48f4-823f-61d27c6396f7" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;">Today, here it is!</p><p id="e1e92541-16d2-4152-a273-64b4242640b5" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;">Today we continue with another <strong id="29b5ded5-a45a-4aeb-b234-9af2ee53c9c7" style="font-size:18px;margin:20px 0px;font-weight:700;">hands-on, step-by-step tutorial</strong>.</p><p id="1b17efca-4524-433a-b4c4-66071edac6ff" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;">Algorithm principles, environment setup, and implementation: <strong id="3573f7b9-fc03-4815-9694-ac3a8f695065" style="font-size:18px;margin:20px 0px;font-weight:700;">the full package</strong>, all covered below!</p><h1 id="aae4a432-92c8-43ef-9d55-a87da7464a03" style="font-size:20px;margin:20px 0px;font-weight:700;">2 How the Algorithm Works</h1><p id="b884e6ef-d8ff-49f4-b317-48d50f4a4acf" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;">First Order Motion, that is, the first-order motion model, comes from a NeurIPS 2019 paper:</p><p id="f1328677-be05-43a8-8bf0-b4066b032279" 
style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;">"First Order Motion Model for Image Animation"</p><p id="4b685f22-fa04-478a-a41b-0c18a328e392" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;">The paper's original goal was to make a "static image" move, as shown below: when you move, it moves too.</p><div id="1b568cd0-1694-44a5-a820-20ee1b64e844" style="font-size:18px;margin:20px 0px;text-align:left;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-qvj2lq49k0/53ea2c834b7e4e7393858d44d123f013~noop.image?_iz=58558&amp;from=article.pc_detail&amp;x-expires=1660566100&amp;x-signature=ZqVMNj2vfHhBtU8ygDSCrb2rawU%3D" style="width:100%;"></div><p id="ea77e2c6-eb40-4cea-849a-e395dc94b246" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;">The model can easily make characters from "Game of Thrones" mimic Trump giving a speech, and it can also make a still image of a horse gallop.</p><div id="472ef2e2-75c4-4c28-84ba-175feb916f13" style="font-size:18px;margin:20px 0px;text-align:left;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-qvj2lq49k0/a25658c2e7ac4118b75cb9cc803832d3~noop.image?_iz=58558&amp;from=article.pc_detail&amp;x-expires=1660566100&amp;x-signature=AjUZYhU7nEE0kR%2BuPo%2FH6zqiPJY%3D" style="width:100%;"></div><div id="d4d040f0-f163-4d39-9d95-295b0b43cdf9" style="font-size:18px;margin:20px 0px;text-align:left;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-qvj2lq49k0/0d6848ee8abf4f419df5aaec54d41721~noop.image?_iz=58558&amp;from=article.pc_detail&amp;x-expires=1660566100&amp;x-signature=YHuk6kehAxybLmDx5k2p5FTK%2FYU%3D" style="width:100%;"></div><p id="9034a9cc-f07a-4174-a2eb-dbf0cfa14f09" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;">The idea behind the first-order motion model is to describe complex motion with a set of self-learned keypoints and local affine transformations.</p><p id="0769159c-ea81-4562-9313-693f59db4f84" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;">The model consists of two main parts: a motion estimation module and an image generation module.</p><div id="86d07cc7-812b-43ce-9f18-2930b9c6a361" style="font-size:18px;margin:20px 0px;text-align:left;"><img 
src="https://p3-sign.toutiaoimg.com/tos-cn-i-qvj2lq49k0/8120090afe3547d19b2e000546fd2753~noop.image?_iz=58558&amp;from=article.pc_detail&amp;x-expires=1660566100&amp;x-signature=IC6NycdB%2BTrj2T661lUNpgLVde0%3D" style="width:100%;"></div><p id="847729a0-33c2-4f08-b249-282907d26d29" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="9b9ff490-4ce6-4a65-bf47-719091af2ef1" style="font-size:18px;margin:20px 0px;text-align:left;">Keypoints are detected first, motion is then estimated from those keypoints, and finally the image generation module produces the output.</span></p><p id="48f7b21b-9dca-4bb6-b4c8-cf8f9934533a" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="5e4ea73c-5f7c-4844-ac71-8157da3c05c5" style="font-size:18px;margin:20px 0px;text-align:left;">In the motion estimation module, the model uses self-supervised learning to separate the target object's appearance from its motion and to build a feature representation of each.</span></p><p id="18f95b6d-34fb-4c34-95df-47214183fd54" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="bb7eccf5-10d0-4132-8653-6859388cd798" style="font-size:18px;margin:20px 0px;text-align:left;">In the image generation module, the model accounts for occlusions that arise while the target moves, extracts appearance information from the given image, and combines it with the previously obtained feature representation to generate the output frame.</span></p><p id="7b92508f-1e7f-4093-9ffb-53a09387627f" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="4dd10e54-c425-4d13-a135-94565dc173d5" style="font-size:18px;margin:20px 0px;text-align:left;">The authors trained and tested the algorithm on four datasets:</span></p><p id="060b96fa-de56-46c7-b25e-b25d8f0b2306" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="f32631c4-7e71-4a60-960f-ff656c95cd27" style="font-size:18px;margin:20px 0px;text-align:left;">the VoxCeleb dataset, the UvA-Nemo dataset, the BAIR robot pushing dataset, and a dataset collected by the authors themselves.</span></p><p id="cf739fc3-40b8-44d7-91d9-f9d04ce2311f" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="99a2026c-fc1e-4c44-9475-86b52792a865" style="font-size:18px;margin:20px 0px;text-align:left;">Of these, VoxCeleb is a large-scale speaker recognition dataset.</span></p><p id="ee8934d8-370d-48b3-9d61-79050165ed1b" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="2d89816d-9ca5-4bc4-8177-61111c920026" style="font-size:18px;margin:20px 0px;text-align:left;">It contains about 100,000 utterances from 1,251 celebrities, taken from YouTube videos. The data is roughly gender-balanced (55% male), and the speakers span a range of accents, professions, and ages.</span></p><div id="0e541902-9362-4c91-8407-c81b047460ab" style="font-size:18px;margin:20px 0px;text-align:left;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-qvj2lq49k0/a76baf7e195748c0b3aa8e27ff35d72f~noop.image?_iz=58558&amp;from=article.pc_detail&amp;x-expires=1660566100&amp;x-signature=qWiq79wtUDlUOUH0aVaqXNIbKaE%3D" style="width:100%;"></div><p id="f5a96c85-be4a-4d2e-9168-0481be74459a" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="0ca5e3ac-0524-4228-960c-7f96fbc5b01a" style="font-size:18px;margin:20px 0px;text-align:left;">First Order Motion was trained on the video frames of this dataset.</span></p><p id="c897a735-2377-4137-9cf0-c50d29e7cab8" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="9eb0eeef-901d-455a-94b1-422ededa4575" style="font-size:18px;margin:20px 0px;text-align:left;">We can therefore use this pretrained facial motion estimation model to complete today's task:</span></p><p id="36d9292c-f1aa-4dcc-b1dc-1b5ea27cc65b" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="4d634975-152d-41d7-9d2f-b04552142330" style="font-size:18px;margin:20px 0px;text-align:left;">"A heartfelt duet by Trump and the Mona Lisa."</span></p><p id="e3febda1-987e-464a-b32f-58b6bd14cc71" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="0c1f01a5-86a4-46b9-8cbc-f051676b6029" style="font-size:18px;margin:20px 0px;text-align:left;">Besides the first-order motion model, we also need OpenCV and ffmpeg to process the video, audio, and images.</span></p><p id="4f4c5b2e-a41e-4cd6-9918-bbd754a82863" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span 
id="edd7f5b6-3f75-4935-bd0c-0321497f3e88" style="font-size:18px;margin:20px 0px;text-align:left;">The concrete implementation is described in Section 4, "Implementation", below.</span></p><h1 id="61d6989d-ed07-40fb-90e7-2e0f37778061" style="font-size:20px;margin:20px 0px;font-weight:700;">3 Environment Setup</h1><p id="2f71d4ab-bc6f-45cb-b344-8a8eb08e173c" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;">For the implementation, we can use an existing library directly to get the functionality we want:</p><p id="fac696b6-ac9f-4a4d-82b7-c4c39a3d09c1" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;">"Real Time Image Animation"</p><p id="cd62b146-a814-43b3-932a-5b1512e1dfae" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><strong id="db091a39-5b45-453b-aa32-5977034ea3ee" style="font-size:18px;margin:20px 0px;font-weight:700;"><span id="67d2f0ec-a5d3-41a8-8c43-08f1e00f5f7d" style="font-size:18px;margin:20px 0px;text-align:left;">Project URL: send me a private message with "333" and I will share it!</span></strong></p><p id="aff1bae8-0d20-4888-af14-305e04964835" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="f41b726a-f035-4f40-91f7-53c083ffb7ad" style="font-size:18px;margin:20px 0px;text-align:left;">Why is Python so popular? </span><strong id="2fc2529e-e29e-458b-b359-e83d84aa6034" style="font-size:18px;margin:20px 0px;font-weight:700;"><span id="581c69f7-8c88-4075-9c2f-4cbaa136fb03" style="font-size:18px;margin:20px 0px;text-align:left;">Precisely because of this</span></strong><span id="e6a3a2cb-0529-4fc0-85ac-928d80c3a635" style="font-size:18px;margin:20px 0px;text-align:left;">.</span></p><p id="060bdc34-f88e-4f5c-83dc-9c7a0162c212" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="c747b9bc-ee0f-4d79-9091-ea418ec9c323" style="font-size:18px;margin:20px 0px;text-align:left;">There are many open-source projects that let us quickly build the features we want, which greatly reduces development cost.</span></p><p id="d10bec2d-43e0-4add-9b3c-6d5d735c5224" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span 
id="d8893a2e-ec74-4427-9633-17a655a34b20" style="font-size:18px;margin:20px 0px;text-align:left;">Really, you only appreciate it once you have used it.</span></p><p id="7f4897ec-301d-4ebe-890f-7a3cb025ca51" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="922dd99b-2221-4c1e-8954-cd70bc83957b" style="font-size:18px;margin:20px 0px;text-align:left;">For the environment, I still recommend Anaconda plus a few required third-party libraries. For details, see this earlier article on setting up a development environment:</span></p><p id="1672548f-1df0-4bc5-8e04-f17f83399843" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;">"PyTorch Deep Learning in Practice (1): Semantic Segmentation Basics and Environment Setup"</p><p id="7bf306f1-78dd-4a6b-8037-6719f220219e" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;">The third-party libraries this project needs are also fully listed:</p><p id="52eb90eb-7c83-4450-860c-f429c09b0277" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="09bd0e29-1fc4-419f-adc0-79a50ac8050e" style="font-size:18px;margin:20px 0px;text-align:left;">https://github.com/anandpawara/Real_Time_Image_Animation/blob/master/requirements.txt</span></p><p id="22ebeebd-1f85-40fe-84e4-67720d061db9" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;">Install them directly with pip:</p><pre id="bb46cbd1-4176-4118-a405-3223f6fa2e65" style="font-size:18px;margin:20px 0px;text-align:left;"><code id="4b8393cb-721f-4874-b1e6-f1905134c1b1" style="font-size:18px;margin:20px 0px;text-align:left;"><span id="5569302b-529f-4ff7-a095-5668ddb5462e" style="font-size:18px;margin:20px 0px;text-align:left;">python</span> <span id="9479d22e-989d-4348-9383-d22bed105110" style="font-size:18px;margin:20px 0px;text-align:left;">-m</span> <span id="29af0ae1-f123-4390-aeb9-afe804a385d2" style="font-size:18px;margin:20px 0px;text-align:left;">pip</span> <span id="28486837-7f48-418b-ad0e-ab9825bb9d50" style="font-size:18px;margin:20px 0px;text-align:left;">install</span> <span id="c559771b-7f0f-4791-944d-3c233480d19a" style="font-size:18px;margin:20px 
0px;text-align:left;">-r</span> <span id="88c012ec-bf2e-4ffd-94d5-fff626d5098d" style="font-size:18px;margin:20px 0px;text-align:left;">requirements</span><span id="7f768523-6930-4001-8870-1a435d29175c" style="font-size:18px;margin:20px 0px;text-align:left;">.txt</span></code></pre><p id="3d16e2e4-957d-4f71-bf46-07cc3979d170" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="0705bdc3-74ac-4fc3-b0f2-1697fa27614f" style="font-size:18px;margin:20px 0px;text-align:left;">In addition, ffmpeg must be set up to handle the audio and video.</span></p><p id="8bdf2d83-4249-401f-93f6-6e362e5a94ad" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="35eab41a-26a5-44cc-ba77-8ecb269071a1" style="font-size:18px;margin:20px 0px;text-align:left;">Just install ffmpeg and add its directory to the PATH environment variable.</span></p><p id="00991cab-561d-44ad-afca-0b91d510f7da" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="7e8f8588-1c58-4356-9bc9-5c9c822e8438" style="font-size:18px;margin:20px 0px;text-align:left;">ffmpeg download URL (the Zeranoe builds site has since shut down; if the link no longer works, use the official download page at https://ffmpeg.org/download.html):</span></p><p id="7526bb07-30ce-4ed1-a143-f34f2fe6b030" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="239f5cf3-2402-4f8c-8138-7f5b3a15dee3" style="font-size:18px;margin:20px 0px;text-align:left;">https://ffmpeg.zeranoe.com/builds/</span></p><h1 id="8ce3b7a9-56f7-411b-9809-ab5fa38b6603" style="font-size:20px;margin:20px 0px;font-weight:700;">4 Implementation</h1><p id="8387ef8a-c6dd-4e99-91ff-333e59941bfb" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="d109342b-7a16-4cc6-b928-f2273bb07abe" style="font-size:18px;margin:20px 0px;text-align:left;">The implementation is also very simple.</span></p><p id="b01fac4b-884d-4eb0-ac60-f1be23ffe3a0" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="584d9a84-3390-4105-81c3-ee112393a654" style="font-size:18px;margin:20px 0px;text-align:left;"><span 
id="c36d1863-a70c-4aee-8217-e15062297f2a" style="font-size:18px;margin:20px 0px;text-align:left;">First, let's outline the approach:</span></span></p><p id="333ede1d-a7e5-486c-9901-f7915e0c2f72" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="f8b96ea0-48ab-4791-8677-927acc30c6b3" style="font-size:18px;margin:20px 0px;text-align:left;"><span id="42fabe5d-a0e9-49f9-b591-b4e615ebd198" style="font-size:18px;margin:20px 0px;text-align:left;">"Real Time Image Animation" uses the first-order motion model to animate a still image according to an existing driving video.</span></span></p><div id="00a97346-dceb-4d15-8ba0-a30b5f215059" style="font-size:18px;margin:20px 0px;text-align:left;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-qvj2lq49k0/875fa05c921346c8bdf954c68f4cad08~noop.image?_iz=58558&amp;from=article.pc_detail&amp;x-expires=1660566100&amp;x-signature=ryc1UHz0B0aqgF%2BKA0BtsPnGcnQ%3D" style="width:100%;"></div><p id="d937ac84-c4b7-416d-bcb0-d992ece6dbc5" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="46a3de41-53e1-4026-ab93-f30d82b62d2b" style="font-size:18px;margin:20px 0px;text-align:left;">The left panel is the original image, the middle is the generated result, and the right is the original video.</span></p><p id="dd6c3c70-f505-4c7b-a5fe-1258bb42359f" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="24eb6101-ab0d-4c5d-899b-bc4b7671ad01" style="font-size:18px;margin:20px 0px;text-align:left;"><span id="12a683c7-fd15-4594-ba19-7d27f826b201" style="font-size:18px;margin:20px 0px;text-align:left;">However, this project </span><strong id="9d7fe401-239b-4bbf-bf2f-737add68478a" style="font-size:18px;margin:20px 0px;font-weight:700;"><span id="c533d242-8b67-48f7-aac9-8340cf4b30a9" style="font-size:18px;margin:20px 0px;text-align:left;">can only process the image frames</span></strong><span id="a4763733-1555-41dc-8bda-087e25bb6ee6" style="font-size:18px;margin:20px 0px;text-align:left;">; </span><strong id="e58bf060-cc24-4219-9a4f-f20e003fea2f" style="font-size:18px;margin:20px 0px;font-weight:700;"><span id="64680cbc-abf6-4226-b504-6538700d2beb" style="font-size:18px;margin:20px 0px;text-align:left;">it cannot preserve the audio</span></strong><span id="f9572f2f-bb64-4d2d-a8d8-cc9c948fa629" style="font-size:18px;margin:20px 0px;text-align:left;">.</span></span></p><p id="0c833e28-7805-43d6-87cc-dfd8f67c42a8" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="7d22565b-f8ec-4784-9c05-4cf9d60c0025" style="font-size:18px;margin:20px 0px;text-align:left;"><span id="19ebdea3-9fe3-4097-8788-974b64fafa2e" style="font-size:18px;margin:20px 0px;text-align:left;">So we need to save the audio first, and then merge the processed video with that audio.</span></span></p><p id="c4b4671b-6e69-43b2-bdd5-7194233d9a43" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="5df24bca-7f29-451a-acc4-f98b879ab8c9" style="font-size:18px;margin:20px 0px;text-align:left;"><span id="2692022b-7fb0-42ff-b87b-4e75a081cd05" style="font-size:18px;margin:20px 0px;text-align:left;">We do this with the ffmpeg we just installed.</span></span></p><p id="1b6b1a65-838a-4b0a-a511-3260c1985c65" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="3808ee28-82f2-44ee-98ef-2396ffed962e" style="font-size:18px;margin:20px 0px;text-align:left;">Write the following code:</span></p><pre id="b05de659-4232-463b-8a42-6876b565ebf1" style="font-size:18px;margin:20px 0px;text-align:left;"><code id="6a9ef995-a1f1-4fd5-b038-88b351bdde5e" style="font-size:18px;margin:20px 0px;text-align:left;"><span id="2b93f156-ff16-4c1e-a479-babb81096307" style="font-size:18px;margin:20px 0px;text-align:left;">import</span> subprocess
import os
from PIL <span id="dad6bf85-ce76-4e90-818a-df58957c5dea" style="font-size:18px;margin:20px 0px;text-align:left;">import</span> Image


def video2mp3(file_name):
    <span id="ce471b15-131b-479e-9614-82a478905a28" style="font-size:18px;margin:20px 0px;text-align:left;">"""
    Extract the audio track of a video into an MP3 file.
    :param file_name: path to the input video file
    :return:
    """</span>
    outfile_name = file_name.split(<span 
id="9d2647e6-561f-494c-a0ea-1e804c848d9f" style="font-size:18px;margin:20px 0px;text-align:left;">'.'</span>)[<span id="0b988ac6-15c7-4233-8672-2bd8cd46f223" style="font-size:18px;margin:20px 0px;text-align:left;">0</span>] + <span id="2a722bfd-47f9-4523-9b79-3c6fc1df2ee6" style="font-size:18px;margin:20px 0px;text-align:left;">'.mp3'</span>
    # The spaces inside the string literals keep the ffmpeg arguments separated.
    cmd = <span id="e9e2bf6e-3391-4bb3-8302-9303d5b6feb0" style="font-size:18px;margin:20px 0px;text-align:left;">'ffmpeg -i '</span> + file_name + <span id="f49bad65-c043-4620-aca9-1a925043c384" style="font-size:18px;margin:20px 0px;text-align:left;">' -f mp3 '</span> + outfile_name
    subprocess.call(cmd, shell=<span id="7aeee1d9-2db9-498c-970d-03f487495f0f" style="font-size:18px;margin:20px 0px;text-align:left;">True</span>)


<span id="21ad927e-ce03-4f9c-9d57-ba1ec5d29dcc" style="font-size:18px;margin:20px 0px;text-align:left;"><span id="17881551-647e-4624-a572-43f209521f53" style="font-size:18px;margin:20px 0px;text-align:left;">def</span> <span id="1a81058b-b6b9-49bf-9f2b-97316496fc21" style="font-size:18px;margin:20px 0px;text-align:left;">video_add_mp3</span><span id="0305aa18-d63e-4a66-8bd8-fecb45e6247d" style="font-size:18px;margin:20px 0px;text-align:left;">(file_name, mp3_file)</span>:</span>
    <span id="477c5df6-f05e-4239-b497-b8551109b2e0" style="font-size:18px;margin:20px 0px;text-align:left;">"""
    Add an audio track to a video.
    :param file_name: path to the input video file
    :param mp3_file: path to the input audio file
    :return:
    """</span>
    outfile_name = file_name.split(<span id="365ae86c-13a0-4524-bca1-3a56888e770e" style="font-size:18px;margin:20px 0px;text-align:left;">'.'</span>)[<span id="f0dc9dcf-25d4-4e68-b254-76bec0659fcf" style="font-size:18px;margin:20px 0px;text-align:left;">0</span>] + <span id="f8b613e8-d52a-4089-a36b-6bb2025ef377" style="font-size:18px;margin:20px 0px;text-align:left;">'-f.mp4'</span>
    subprocess.call(<span id="2c65de85-6227-4cdb-9e33-b475db8c9183" style="font-size:18px;margin:20px 0px;text-align:left;">'ffmpeg -i '</span> + file_name
                    + <span 
id="b5be7e53-ad13-4cc0-b1e1-a8e37d7557d9" style="font-size:18px;margin:20px 0px;text-align:left;">' -i '</span> + mp3_file + <span id="2dd4b924-fc6b-4542-a2aa-dd3293e6a9b2" style="font-size:18px;margin:20px 0px;text-align:left;">' -strict -2 -f mp4 '</span>
                    + outfile_name, shell=<span id="02fe4602-4fca-436c-a4c2-84526f5e7e50" style="font-size:18px;margin:20px 0px;text-align:left;">True</span>)</code></pre><p id="c811f554-1bd6-4e37-a992-d7d492f8d5ff" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="368f0fc4-9ae4-439c-be70-f305f1b7d649" style="font-size:18px;margin:20px 0px;text-align:left;">Done: both extracting the audio from the video and merging the audio back in are taken care of.</span></p><p id="5641b133-9019-402e-82c8-9980622ff8a3" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="7a1edeff-fcf6-4b37-8741-399bb9675f0d" style="font-size:18px;margin:20px 0px;text-align:left;">We need to modify the "Real Time Image Animation" project by editing its image_animation.py file.</span></p><pre id="6ff04c58-554b-4af1-aa53-b284321f0b35" style="font-size:18px;margin:20px 0px;text-align:left;"><code id="994c28e4-f8b7-447c-9805-dc09f9510f59" style="font-size:18px;margin:20px 0px;text-align:left;"><span id="09071e1b-3ce0-4716-9335-5285a127ba97" style="font-size:18px;margin:20px 0px;text-align:left;">import</span> imageio
import torch
from tqdm <span id="5e27e398-7d5e-4949-89eb-dbe4eac5457f" style="font-size:18px;margin:20px 0px;text-align:left;">import</span> tqdm
from animate <span id="f397f67d-bc24-4cf2-aa47-fa8ffb40f18e" style="font-size:18px;margin:20px 0px;text-align:left;">import</span> normalize_kp
from demo <span id="9ee4df5f-5ab1-4236-af4d-81637e5c9bb2" style="font-size:18px;margin:20px 0px;text-align:left;">import</span> load_checkpoints
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
from skimage <span id="836d5134-40af-44dc-b513-aa8546effb49" style="font-size:18px;margin:20px 0px;text-align:left;">import</span> img_as_ubyte
from skimage.transform <span 
id="7d2e4a09-c73c-4b02-b9b1-6cf5bcbf1bc0" style="font-size:18px;margin:20px 0px;text-align:left;">import</span> resize
import cv2
import os
import argparse
import subprocess
from PIL <span id="7b45c543-1490-4a74-8830-227418fe9ca1" style="font-size:18px;margin:20px 0px;text-align:left;">import</span> Image


def video2mp3(file_name):
    <span id="275eba05-6ec6-4b2d-a9a8-d2de216f91c8" style="font-size:18px;margin:20px 0px;text-align:left;">""</span><span id="628cdc58-5c45-437d-9b84-96602bb937f5" style="font-size:18px;margin:20px 0px;text-align:left;">"
    Extract the audio track of a video into an MP3 file.
    :param file_name: path to the input video file
    :return:
    "</span><span id="bad728f7-6d05-49cd-93bc-a09d8c3c9e91" style="font-size:18px;margin:20px 0px;text-align:left;">""</span>
    outfile_name = file_name.split(<span id="96637777-5a7d-49c6-b572-ad6df4a91668" style="font-size:18px;margin:20px 0px;text-align:left;">'.'</span>)[<span id="b5eb1fb8-f409-4ed9-bd3f-a551211a40ce" style="font-size:18px;margin:20px 0px;text-align:left;">0</span>] + <span id="342eed04-007a-488a-96af-4c9408b27ba9" style="font-size:18px;margin:20px 0px;text-align:left;">'.mp3'</span>
    cmd = <span id="d37fcbd6-c1a5-4db5-bddf-056c3c613ac6" style="font-size:18px;margin:20px 0px;text-align:left;">'ffmpeg -i '</span> + file_name + <span id="67da5932-dc60-4e0a-841b-98ff1bbfd03b" style="font-size:18px;margin:20px 0px;text-align:left;">' -f mp3 '</span> + outfile_name
    <span id="ad02186e-961f-4f02-9796-cecff4d137ff" style="font-size:18px;margin:20px 0px;text-align:left;">print</span>(cmd)
    subprocess.call(cmd, shell=True)


def video_add_mp3(file_name, mp3_file):
    <span id="8b3b8a86-a470-4acf-a50c-c21eeaad25ec" style="font-size:18px;margin:20px 0px;text-align:left;">""</span><span id="d2e3b253-0124-4f19-b5f3-6f8beea31ccf" style="font-size:18px;margin:20px 0px;text-align:left;">"
    Add an audio track to a video.
    :param file_name: path to the input video file
    :param mp3_file: path to the input audio file
    :return:
    "</span><span id="a322d79f-43dc-4514-ad3c-ba61db388a81" style="font-size:18px;margin:20px 
0px;text-align:left;">""</span>
    outfile_name = file_name.split(<span id="1f86a71e-efb0-4d5d-9c9b-5455343fc15f" style="font-size:18px;margin:20px 0px;text-align:left;">'.'</span>)[<span id="07a5d804-5626-47f8-9fb5-cf4b10ec7ee5" style="font-size:18px;margin:20px 0px;text-align:left;">0</span>] + <span id="e51706d0-72cd-44d4-bab8-826536abb111" style="font-size:18px;margin:20px 0px;text-align:left;">'-f.mp4'</span>
    subprocess.call(<span id="c691ffd9-13b8-4785-95ed-ff0effa81a26" style="font-size:18px;margin:20px 0px;text-align:left;">'ffmpeg -i '</span> + file_name
                    + <span id="6ede03df-ccfe-4c0d-bb0d-fcf9049a050f" style="font-size:18px;margin:20px 0px;text-align:left;">' -i '</span> + mp3_file + <span id="35e92701-a9e7-497f-bb29-aef6bbdfe017" style="font-size:18px;margin:20px 0px;text-align:left;">' -strict -2 -f mp4 '</span>
                    + outfile_name, shell=True)


ap = argparse.ArgumentParser()
ap.add_argument(<span id="8a6a644f-94b8-4075-bb57-687f3ddf17d8" style="font-size:18px;margin:20px 0px;text-align:left;">"-i"</span>, <span id="fc08464b-0598-4a32-ba11-b2095a9ff87a" style="font-size:18px;margin:20px 0px;text-align:left;">"--input_image"</span>, required=True, help=<span id="aa866321-0188-4c34-8a5e-644862fa7fde" style="font-size:18px;margin:20px 0px;text-align:left;">"Path to image to animate"</span>)
ap.add_argument(<span id="b96e7c73-774b-42e5-b1d5-77e6413e37c0" style="font-size:18px;margin:20px 0px;text-align:left;">"-c"</span>, <span id="a921067a-0108-4c56-aa80-c07d40a32f8e" style="font-size:18px;margin:20px 0px;text-align:left;">"--checkpoint"</span>, required=True, help=<span id="1cf218af-1953-4c98-b77a-d3fc1ba750f0" style="font-size:18px;margin:20px 0px;text-align:left;">"Path to checkpoint"</span>)
ap.add_argument(<span id="11f8b8b4-b175-48e5-9fb2-9db7cdb6d386" style="font-size:18px;margin:20px 0px;text-align:left;">"-v"</span>, <span id="6fa076dd-3c38-486d-a44f-1be8ef0e884e" style="font-size:18px;margin:20px 0px;text-align:left;">"--input_video"</span>, required=False, help=<span 
id="fca9674f-2264-4242-9974-23460ba85c36" style="font-size:18px;margin:20px 0px;text-align:left;">"Path to video input"</span>)
args = vars(ap.parse_args())

<span id="a0fc0459-df2b-4a3b-96b2-10da4bbd0827" style="font-size:18px;margin:20px 0px;text-align:left;">print</span>(<span id="fa8b9b4e-7fdc-4a7d-b2c4-fbe129573733" style="font-size:18px;margin:20px 0px;text-align:left;">" loading source image and checkpoint..."</span>)
source_path = args[<span id="feca0666-6e78-4e5a-a086-a8cb0a41ade2" style="font-size:18px;margin:20px 0px;text-align:left;">'input_image'</span>]
checkpoint_path = args[<span id="41ff2b5e-80a7-4b6e-becf-55cb547c86b6" style="font-size:18px;margin:20px 0px;text-align:left;">'checkpoint'</span>]
<span id="35d1506b-2059-43c3-9a1a-d6763828d148" style="font-size:18px;margin:20px 0px;text-align:left;">if</span> args[<span id="58518493-59b1-49b2-9e28-338b9ed06c64" style="font-size:18px;margin:20px 0px;text-align:left;">'input_video'</span>]:
    video_path = args[<span id="66f2f4c9-f06e-4cdf-91f4-6efca9d08a5b" style="font-size:18px;margin:20px 0px;text-align:left;">'input_video'</span>]
<span id="273eb5ad-8eac-449b-bba2-8f94ccd0b95b" style="font-size:18px;margin:20px 0px;text-align:left;">else</span>:
    video_path = None

# Load the source image and resize it to 256x256, keeping only the RGB channels.
source_image = imageio.imread(source_path)
source_image = resize(source_image, (<span id="42fba26b-0566-4a9b-99b9-95fe935ae7b0" style="font-size:18px;margin:20px 0px;text-align:left;">256</span>, <span id="29e77551-fa71-4462-acc5-5c39367cce04" style="font-size:18px;margin:20px 0px;text-align:left;">256</span>))[..., :<span id="3cb3f2ad-dbcb-4147-bc73-fa4ed36b1091" style="font-size:18px;margin:20px 0px;text-align:left;">3</span>]
generator, kp_detector = load_checkpoints(config_path=<span id="59108965-5c56-403c-8859-27197721b0aa" style="font-size:18px;margin:20px 0px;text-align:left;">'config/vox-256.yaml'</span>, checkpoint_path=checkpoint_path)

<span id="116be4fa-0c12-48c5-93ae-5cc726600c6e" style="font-size:18px;margin:20px 0px;text-align:left;">if</span> not 
os.path.exists(<span id="1831ea8f-961f-4c56-9aea-3e7de9422675" style="font-size:18px;margin:20px 0px;text-align:left;">'output'</span>):
    os.mkdir(<span id="aed9821c-c187-40ba-b2f4-325fe751a9c5" style="font-size:18px;margin:20px 0px;text-align:left;">'output'</span>)

relative = True
adapt_movement_scale = True
cpu = False

# Read from the given video file if there is one, otherwise from the default camera.
if video_path:
    <span id="c12ecaf2-f4e3-4606-a1f2-26052a91246f" style="font-size:18px;margin:20px 0px;text-align:left;">cap</span> = cv2.VideoCapture(video_path)
    <span id="2e78ba5d-ad30-417f-8e0c-00d34a8f6ca7" style="font-size:18px;margin:20px 0px;text-align:left;">print</span>(<span id="4f8c5e1f-d680-4bb2-8c28-ab2d42fa6300" style="font-size:18px;margin:20px 0px;text-align:left;">" Loading video from the given path"</span>)
<span id="f1326c32-3d4c-4736-ae74-b98ce5edaca7" style="font-size:18px;margin:20px 0px;text-align:left;">else</span>:
    <span id="331af397-bb56-43b0-aab5-5775d6803716" style="font-size:18px;margin:20px 0px;text-align:left;">cap</span> = cv2.VideoCapture(<span id="ee201e3b-498f-4b61-984f-342a2b9fb852" style="font-size:18px;margin:20px 0px;text-align:left;">0</span>)
    <span id="a6afff22-51b5-4acf-82be-5164c68e85a1" style="font-size:18px;margin:20px 0px;text-align:left;">print</span>(<span id="7c9f18d1-354d-4fb7-acb6-479cb86a21d8" style="font-size:18px;margin:20px 0px;text-align:left;">" Initializing front camera..."</span>)

fps = <span id="5b9bda83-4309-49b0-95ae-09093bf01e38" style="font-size:18px;margin:20px 0px;text-align:left;">cap</span>.get(cv2.CAP_PROP_FPS)
size = (<span id="2636d911-bcf0-4484-b2d5-87c23eaf25f1" style="font-size:18px;margin:20px 0px;text-align:left;">int</span>(<span id="0eb8f7f8-13f7-4b35-87aa-a5f0b2a5bf6f" style="font-size:18px;margin:20px 0px;text-align:left;">cap</span>.get(cv2.CAP_PROP_FRAME_WIDTH)), <span id="af1a2115-4d23-4d18-98bf-95e1acac6e07" style="font-size:18px;margin:20px 0px;text-align:left;">int</span>(<span id="e74333e2-5993-43ef-a7a0-22615c67e71b" style="font-size:18px;margin:20px 
0px;text-align:left;">cap</span>.get(cv2.CAP_PROP_FRAME_HEIGHT)))

# Save the driving video's audio track so it can be merged back in later (skipped for webcam input).
if video_path:
    video2mp3(file_name=video_path)

fourcc = cv2.VideoWriter_fourcc(<span id="b4a1e8aa-dd19-4fd1-bd8e-f07f0d6fc41d" style="font-size:18px;margin:20px 0px;text-align:left;">'M'</span>, <span id="a31ff586-1faa-4b26-967d-f9c4db2d2275" style="font-size:18px;margin:20px 0px;text-align:left;">'P'</span>, <span id="2edc9eb7-5821-4f99-8655-f5fbab42c086" style="font-size:18px;margin:20px 0px;text-align:left;">'E'</span>, <span id="755fa949-09c2-4e7d-8e11-f843f536d8a0" style="font-size:18px;margin:20px 0px;text-align:left;">'G'</span>)
# Note: this first writer is immediately replaced by the one on the next line.
out1 = cv2.VideoWriter(<span id="386ea021-639a-434a-8535-b3d008ff3e53" style="font-size:18px;margin:20px 0px;text-align:left;">'output/test.avi'</span>, fourcc, fps, (<span id="84975ee0-3e04-4124-8f2f-0c502f33a51b" style="font-size:18px;margin:20px 0px;text-align:left;">256</span> * <span id="c3d803e6-51a9-42f6-baf8-161536aef8ab" style="font-size:18px;margin:20px 0px;text-align:left;">3</span>, <span id="5bd488fe-13f3-40f4-a445-c7b533c87685" style="font-size:18px;margin:20px 0px;text-align:left;">256</span>), True)
out1 = cv2.VideoWriter(<span id="269a64cb-4f82-4e8b-acef-06bb7d9bba0a" style="font-size:18px;margin:20px 0px;text-align:left;">'output/test.mp4'</span>, fourcc, fps, size, True)

cv2_source = cv2.cvtColor(source_image.astype(<span id="931dd476-ee83-4db0-af34-d3809349cf81" style="font-size:18px;margin:20px 0px;text-align:left;">'float32'</span>), cv2.COLOR_BGR2RGB)
with torch.no_grad():
    predictions = []
    # Add a leading batch axis so that permute(0, 3, 1, 2) yields an NCHW tensor.
    source = torch.tensor(source_image[np.newaxis].astype(np.<span id="f7495630-81ad-446c-96d6-fdc2f6364b54" style="font-size:18px;margin:20px 0px;text-align:left;">float32</span>)).permute(<span id="46fbd1b2-cfbd-4436-9786-40c929735f4a" style="font-size:18px;margin:20px 0px;text-align:left;">0</span>, <span id="915e33b5-2171-412f-8980-ad200f181ddd" style="font-size:18px;margin:20px 0px;text-align:left;">3</span>, <span id="65c2ddd4-b997-4ba1-b8a1-1896606389f9" style="font-size:18px;margin:20px 
0px;text-align:left;">1</span>, <span id="0fd9753a-511b-4184-b770-ba9c97f97750" style="font-size:18px;margin:20px 0px;text-align:left;">2</span>)
    <span id="86361766-95b0-4b85-abe7-7f070ac6ec1b" style="font-size:18px;margin:20px 0px;text-align:left;">if</span> not cpu:
      source = source.cuda()
    kp_source = kp_detector(source)
    count = <span id="f2ded925-74b4-4a7f-9320-31b03275f176" style="font-size:18px;margin:20px 0px;text-align:left;">0</span>
    while(True):
      ret, frame = <span id="dd51e150-68e4-4008-94dc-f33014b89288" style="font-size:18px;margin:20px 0px;text-align:left;">cap</span>.read()
      <span id="f9f7a630-f1f9-4c39-8a70-6d332884990a" style="font-size:18px;margin:20px 0px;text-align:left;">if</span> ret == True:
            # Flip only after a successful read; frame is None once the video ends.
            frame = cv2.flip(frame, <span id="d8905062-cdce-482f-afa2-fa4126e888ad" style="font-size:18px;margin:20px 0px;text-align:left;">1</span>)
            <span id="b61ebdd1-8aa8-4506-9038-84d5ae3cd463" style="font-size:18px;margin:20px 0px;text-align:left;">if</span> not video_path:
                # Webcam input: crop a fixed 322x322 region.
                x = <span id="3ac1dbe0-7848-4ec2-b916-b80db343a1e4" style="font-size:18px;margin:20px 0px;text-align:left;">143</span>
                y = <span id="1beb8867-3e65-40ea-9417-4c63073d2bca" style="font-size:18px;margin:20px 0px;text-align:left;">87</span>
                w = <span id="070dc378-2835-4179-9ded-8cb253c1c60f" style="font-size:18px;margin:20px 0px;text-align:left;">322</span>
                h = <span id="1f724122-4c16-454e-b228-8b58102584eb" style="font-size:18px;margin:20px 0px;text-align:left;">322</span>
                frame = frame[y:y+h, x:x+w]
            frame1 = resize(frame, (<span id="65e20664-1b75-4a01-9a58-22254280d967" style="font-size:18px;margin:20px 0px;text-align:left;">256</span>, <span id="59a8f847-b576-4dfc-8cb3-641963fde6d0" style="font-size:18px;margin:20px 0px;text-align:left;">256</span>))[..., :<span id="7a7cb9a7-e23b-431f-906e-7f1cb2646d98" style="font-size:18px;margin:20px 0px;text-align:left;">3</span>]

            <span id="5847efeb-a7af-432c-859b-f546fc2f7e48" style="font-size:18px;margin:20px 0px;text-align:left;">if</span> count 
==<span id="efe4bb3a-93f1-4d70-b650-f0c187c86589" style="font-size:18px;margin:20px 0px;text-align:left;">0</span>:                source_image1 = frame1                source1 = torch.tensor(source_image1.astype(np.<span id="2df952d6-84f1-407b-8232-2c5684d3b3a6" style="font-size:18px;margin:20px 0px;text-align:left;">float32</span>)).permute(<span id="952a742f-fb03-4194-90ad-252f6cf2eeb2" style="font-size:18px;margin:20px 0px;text-align:left;">0</span>,<span id="276ce206-502d-4a9e-8e19-fb4bfd60c750" style="font-size:18px;margin:20px 0px;text-align:left;">3</span>,<span id="5c28e5e8-cfcb-4b99-bcaf-eab0d8f6ef71" style="font-size:18px;margin:20px 0px;text-align:left;">1</span>,<span id="83e4c919-3c25-41b5-babf-3b2185816b2e" style="font-size:18px;margin:20px 0px;text-align:left;">2</span>)                kp_driving_initial = kp_detector(source1)            frame_test = torch.tensor(frame1.astype(np.<span id="6f706399-c145-4ef9-b432-b0f61f4c2619" style="font-size:18px;margin:20px 0px;text-align:left;">float32</span>)).permute(<span id="437f90c0-34fb-4e41-a0a1-c910b84e3f06" style="font-size:18px;margin:20px 0px;text-align:left;">0</span>,<span id="74129634-0ff1-4773-94ba-0abffc7a7963" style="font-size:18px;margin:20px 0px;text-align:left;">3</span>,<span id="4cdc0a4c-d843-4cd7-9bd1-71f7a40dd6ea" style="font-size:18px;margin:20px 0px;text-align:left;">1</span>,<span id="82d2007a-d16b-4d94-a049-42e499531de3" style="font-size:18px;margin:20px 0px;text-align:left;">2</span>)            driving_frame = frame_test<span id="2c021961-319a-4b34-9a7d-c507f9dd4624" style="font-size:18px;margin:20px 0px;text-align:left;">if</span>not cpu:                driving_frame = driving_frame.cuda()            kp_driving = kp_detector(driving_frame)            kp_norm = normalize_kp(kp_source=kp_source,                              kp_driving=kp_driving,                              kp_driving_initial=kp_driving_initial,                                 use_relative_movement=relative,         
                     use_relative_jacobian=relative,                                 adapt_movement_scale=adapt_movement_scale)            out = generator(source, kp_source=kp_source, kp_driving=kp_norm)            predictions.<span id="bfc0681c-cd85-46cc-b5d6-07e1d8e60d24" style="font-size:18px;margin:20px 0px;text-align:left;">append</span>(np.transpose(out[<span id="eaf02044-bd51-4ef3-934e-26352c30c8ba" style="font-size:18px;margin:20px 0px;text-align:left;">prediction</span>].data.cpu().numpy(), [<span id="da37cdab-153f-4988-8006-451e57974be5" style="font-size:18px;margin:20px 0px;text-align:left;">0</span>,<span id="0bb53372-c244-43a6-9e68-1ce3c3fc01b3" style="font-size:18px;margin:20px 0px;text-align:left;">2</span>,<span id="dab46539-9884-4718-9ea7-a737c142fc5a" style="font-size:18px;margin:20px 0px;text-align:left;">3</span>,<span id="a9b4c6f0-1111-48c2-b1aa-a7925f3004ae" style="font-size:18px;margin:20px 0px;text-align:left;">1</span>])[<span id="40a14fd8-273c-45dd-a7d6-70431f8be6da" style="font-size:18px;margin:20px 0px;text-align:left;">0</span>])            im = np.transpose(out[<span id="6d39fdba-dee7-4ff8-8a3d-8d54b75ef343" style="font-size:18px;margin:20px 0px;text-align:left;">prediction</span>].data.cpu().numpy(), [<span id="16a27349-1ede-44ff-afe4-a76f50a9564f" style="font-size:18px;margin:20px 0px;text-align:left;">0</span>,<span id="e45d8ea8-7f4d-46cf-990a-c71465a748d5" style="font-size:18px;margin:20px 0px;text-align:left;">2</span>,<span id="847edd4d-740b-4ea2-9351-012b6e1c30a9" style="font-size:18px;margin:20px 0px;text-align:left;">3</span>,<span id="bc0e1cf3-b313-4914-a597-88bbd07b2ba9" style="font-size:18px;margin:20px 0px;text-align:left;">1</span>])[<span id="20a13769-d2cc-499c-b908-d275763534c3" style="font-size:18px;margin:20px 0px;text-align:left;">0</span>]            im = cv2.cvtColor(im,cv2.COLOR_RGB2BGR)            joinedFrame = np.concatenate((cv2_source,im,frame1),axis=<span id="066ad196-2c01-46bf-85b0-ae33a14fb5f1" 
style="font-size:18px;margin:20px 0px;text-align:left;">1</span>)            joinedFrame = np.concatenate((cv2_source,im,frame1),axis=<span id="ed6216d6-55e8-45cb-96bf-a4dfb58666a2" style="font-size:18px;margin:20px 0px;text-align:left;">1</span>)            cv2.imshow(<span id="8d60e3f4-4e28-49f1-ae8b-1767786186d5" style="font-size:18px;margin:20px 0px;text-align:left;">Test</span>,joinedFrame)            out1.write(img_as_ubyte(joinedFrame))            out1.write(img_as_ubyte(im))            count +=<span id="1aafd398-5694-4732-a26c-fedcccb7205d" style="font-size:18px;margin:20px 0px;text-align:left;">1</span><span id="82714061-482a-47e2-a61b-455c3cb92b52" style="font-size:18px;margin:20px 0px;text-align:left;">if</span>cv2.waitKey(<span id="2e3ec579-7df7-43da-b320-f880c9e544a0" style="font-size:18px;margin:20px 0px;text-align:left;">20</span>) &amp;<span id="bb9e91ee-bbb7-4527-a70d-53bf613e1be2" style="font-size:18px;margin:20px 0px;text-align:left;">0xFF</span>== ord(<span id="22ac1ffe-390d-4bb5-a44d-07a5cb2efbeb" style="font-size:18px;margin:20px 0px;text-align:left;">q</span>):<span id="967935df-4ea8-4ad4-adfd-12761386f2ad" style="font-size:18px;margin:20px 0px;text-align:left;">break</span><span id="5cbf17a8-e568-4ab2-9125-934879cccc32" style="font-size:18px;margin:20px 0px;text-align:left;">else</span>:<span id="aa67b83d-33d7-45e6-86f7-2533be1159ef" style="font-size:18px;margin:20px 0px;text-align:left;">break</span><span id="13ceacaa-976f-4930-a13c-1af6133b7880" style="font-size:18px;margin:20px 0px;text-align:left;">cap</span>.release()    out1.release()    cv2.destroyAllWindows()video_add_mp3(file_name=<span id="59d16b40-9846-4ed8-bd76-a7ccc09ca140" style="font-size:18px;margin:20px 0px;text-align:left;">output/test.mp4</span>, mp3_file=video_path.split(<span id="9812d443-0214-4d3e-a8bd-7e13d8c2023c" style="font-size:18px;margin:20px 0px;text-align:left;">.</span>)[<span id="7da086ec-bb07-46f6-9ce8-39263b96acb9" style="font-size:18px;margin:20px 
0px;text-align:left;">0</span>] + <span id="788ad38a-2f27-4c56-b652-7bcdf8b51be6" style="font-size:18px;margin:20px 0px;text-align:left;">'.mp3'</span>)</code></pre><p id="6ae9a95b-74e0-4f52-b022-30f0f6b8d1da" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="87f4bace-44d2-4fff-a833-611577745d1a" style="font-size:18px;margin:20px 0px;text-align:left;">Next, download the <strong id="5dacdc1c-8656-4bb3-84e2-622b0c8278f8" style="font-size:18px;margin:20px 0px;font-weight:700;">weight files</strong> the algorithm needs, along with the <strong id="757f471c-f3f6-482f-9474-b553d84be129" style="font-size:18px;margin:20px 0px;font-weight:700;">video and image assets</strong>.</span></p><div id="42a59ada-f4b6-4760-8a7d-32961b5783d5" style="font-size:18px;margin:20px 0px;text-align:left;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-qvj2lq49k0/a7d2638328ad44839e88a335acb683f8~noop.image?_iz=58558&amp;from=article.pc_detail&amp;x-expires=1660566100&amp;x-signature=eoAPlXjIVbc6ALGreUXFfqeSbRE%3D" style="width:100%;"></div><p id="54d894a1-5993-4d4f-a9bd-1ccda60cba88" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="244061dc-28b6-4656-ba7a-b184ffc69a1e" style="font-size:18px;margin:20px 0px;text-align:left;"><span id="cd9103d2-173a-4c61-8a90-fe1331fd62ed" style="font-size:18px;margin:20px 0px;text-align:left;">The modified </span><strong id="c87084d4-0f05-4464-b7cb-425eea97c660" style="font-size:18px;margin:20px 0px;font-weight:700;"><span id="232d1182-45fd-4a1b-a32e-e24b4d0f4ad3" style="font-size:18px;margin:20px 0px;text-align:left;">code</span></strong><span id="c387ebba-0479-41f3-84a4-36ff18d72ab8" style="font-size:18px;margin:20px 0px;text-align:left;">, </span><strong id="424df25b-1d69-45b4-b51a-a9146abd21e7" style="font-size:18px;margin:20px 0px;font-weight:700;"><span id="2ce53a1e-829b-4c30-b8d5-2fbc7a9f0df1" style="font-size:18px;margin:20px 0px;text-align:left;">weight files</span></strong><span id="20badf05-4de6-4509-a29d-fc3f8d3c269f" style="font-size:18px;margin:20px 0px;text-align:left;">, and </span><strong id="f01cd3ca-e6d8-458b-b93f-dc2c8777cc55" style="font-size:18px;margin:20px 0px;font-weight:700;"><span id="216f5074-93c7-40fe-969a-a1dd91df8484" style="font-size:18px;margin:20px 0px;text-align:left;">video and image assets</span></strong><span id="6ac7bf66-d5c7-4f2b-94c8-332b48ee4ae2" style="font-size:18px;margin:20px 0px;text-align:left;"> </span><strong id="4e485bd5-b85f-4c15-956c-43e04dc24a8e" style="font-size:18px;margin:20px 0px;font-weight:700;"><span id="f61bda52-246b-42fc-9731-6f3159659d3b" style="font-size:18px;margin:20px 0px;text-align:left;">are all packaged up</span></strong><span id="b51cf1d4-e10a-4d60-8a06-7f0170589baa" style="font-size:18px;margin:20px 0px;text-align:left;">, and </span><strong id="7e0cffe4-2794-475c-becf-11b08945813f" style="font-size:18px;margin:20px 0px;font-weight:700;"><span id="33bacd8e-92e0-44fc-b568-3e251a1c5d0a" style="font-size:18px;margin:20px 0px;text-align:left;">you can use them as-is</span></strong><span id="3061458a-cc39-4c92-bb76-fde93a7ed811" style="font-size:18px;margin:20px 0px;text-align:left;">.</span></span></p><p id="5e6bd944-fc09-4d6e-a0bd-13e4d7270786" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="b3eedf79-9e2b-4367-b8d5-f92216c1519b" style="font-size:18px;margin:20px 0px;text-align:left;"><span id="4767e464-cfb4-4f47-922f-8dc0454ea59a" style="font-size:18px;margin:20px 0px;text-align:left;">Download link (password: amz5):</span></span></p><p id="7415c341-917b-40b3-82e8-6a528c19fdb8" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="d15ba7b3-0bb9-47f1-86a1-414ceed96e22" style="font-size:18px;margin:20px 0px;text-align:left;"><span id="48cdb4bd-8075-4c64-a853-6ff004cc9466" style="font-size:18px;margin:20px 0px;text-align:left;">https://pan.baidu.com/s/1TEd7SOaO5mzPaxpOh2pALQ</span></span></p><p id="eaec2ee6-9795-4dca-8fd2-ba6cdd724ff8" 
style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="ea8dc23e-4c3b-4707-9505-5f5a13302c71" style="font-size:18px;margin:20px 0px;text-align:left;"><span id="de270267-3456-4830-9c00-70c5b74f36a6" style="font-size:18px;margin:20px 0px;text-align:left;">Run command:</span></span></p><pre id="6e89e875-c169-4b54-8ffe-2ef888c91fa1" style="font-size:18px;margin:20px 0px;text-align:left;"><code id="383e084d-6132-4e56-aa46-e844aec3b6b6" style="font-size:18px;margin:20px 0px;text-align:left;"><span id="13fae4a1-419f-4bb9-a6c8-91005b2c60e0" style="font-size:18px;margin:20px 0px;text-align:left;">python</span> <span id="8fabf4f1-3fa3-482d-8200-994ee565789a" style="font-size:18px;margin:20px 0px;text-align:left;">image_animation</span><span id="1cbc8ad0-af1f-40d6-8ae2-23dbc0d16579" style="font-size:18px;margin:20px 0px;text-align:left;">.py</span> <span id="e61231f9-5781-4074-a485-be6e42a88683" style="font-size:18px;margin:20px 0px;text-align:left;">-i</span> <span id="234ac6af-ad5c-4f96-9411-abdb5c2d11b7" style="font-size:18px;margin:20px 0px;text-align:left;">path_to_input_file</span> <span id="9e0bae06-b0ca-4b12-9733-9bbd7e8d07e9" style="font-size:18px;margin:20px 0px;text-align:left;">-c</span> <span id="50ddeebf-7fdd-4b69-92f7-a26c6b9ad877" style="font-size:18px;margin:20px 0px;text-align:left;">path_to_checkpoint</span> <span id="5609c4d5-26c4-4cca-bb07-9411afbe66e4" style="font-size:18px;margin:20px 0px;text-align:left;">-v</span> <span id="ad355ca1-c15b-4907-8cef-d6438b223b9f" style="font-size:18px;margin:20px 0px;text-align:left;">path_to_video_file</span></code></pre><p id="d586e8bc-97fa-4252-90e6-68450c6f2f95" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="ab20987e-5654-47aa-b2de-2c1f0e59e657" style="font-size:18px;margin:20px 0px;text-align:left;"><span id="ab678c09-f3ff-4c66-823c-ef2ea4b4bf4f" style="font-size:18px;margin:20px 0px;text-align:left;">path_to_input_file 
is the input template image, i.e. the still picture to animate.</span></span></p><p id="66df60f1-2fc3-40c4-b4a0-1011ec2a6b52" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="e26da6b9-8588-4411-ae52-35070a57797d" style="font-size:18px;margin:20px 0px;text-align:left;"><span id="3a1939dd-4bc3-4e53-9cf9-39a1628bf8dd" style="font-size:18px;margin:20px 0px;text-align:left;">path_to_checkpoint is the path to the weight file.</span></span></p><p id="a96c6c2b-a57a-419e-8072-131de494809a" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="7511ceb2-487a-42aa-9dc1-a0b8a6a219a2" style="font-size:18px;margin:20px 0px;text-align:left;"><span id="7b17c008-3610-464d-8078-82e7be7df1ad" style="font-size:18px;margin:20px 0px;text-align:left;">path_to_video_file is the input (driving) video file.</span></span></p><p id="7dadd2fb-9ff0-46a5-abce-3d07ee3d00ec" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="7903686e-742c-44b3-b552-6a48e45d9a60" style="font-size:18px;margin:20px 0px;text-align:left;"><span id="7fe11429-ea50-45c7-bc07-16c9a7c2e4bc" style="font-size:18px;margin:20px 0px;text-align:left;">If you use my packaged program, you can run the following command directly to produce the video shown at the start of this article:</span></span></p><pre id="9d719754-0ec2-4719-9cfa-ca86a1368dc4" style="font-size:18px;margin:20px 0px;text-align:left;"><code id="31c94f9a-f230-4d8c-8ad4-ef74bdf096b3" style="font-size:18px;margin:20px 0px;text-align:left;"><span id="a22ba4de-aa1c-4b8e-be75-7dab26c0a8e4" style="font-size:18px;margin:20px 0px;text-align:left;">python</span> image_animation.py -i Inputs/trump2.png -c checkpoints/vox-cpk.pth.tar -v <span id="74e79f50-40fc-499c-9577-4b4a99073179" style="font-size:18px;margin:20px 0px;text-align:left;">1</span>.mp4</code></pre><p id="2eb93aad-995d-4961-924d-7e5cbe698b6b" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="3e6f0a6e-ec1d-48d9-b197-151517e7124e" style="font-size:18px;margin:20px 0px;text-align:left;"><span id="1da240a5-ff74-4783-ba15-6c41d838684f" style="font-size:18px;margin:20px 0px;text-align:left;">The generated video is saved in the output folder.</span></span></p><p id="3dbdc91f-6dc2-4422-9916-8e5b92ba529e" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="98f71eb3-a5f9-4448-9980-abd4953d7406" style="font-size:18px;margin:20px 0px;text-align:left;"><span id="816a740a-19a6-400b-a07a-0ce09d2e4b41" style="font-size:18px;margin:20px 0px;text-align:left;">All done!</span></span></p><h1 id="ccaa427d-af64-4d53-97ac-1403fc0172aa" style="font-size:20px;margin:20px 0px;font-weight:700;">5 Conclusion</h1><p id="85d0db26-d0ca-4ef9-a6fe-e06fe330fe6e" style="font-weight:400;text-align:left;line-height:1.667;margin:20px 0px;font-size:18px;"><span id="55b2ef80-3454-48fa-a690-8eeca53169c3" style="font-size:18px;margin:20px 0px;text-align:left;">The algorithm processes video quickly: on a GPU it finishes in just a few seconds.</span></p>