facefusion/facefusion/uis/components/webcam.py
from typing import Optional, Generator, Deque, List
import os
import platform
import subprocess
import cv2
import gradio
from time import sleep
from concurrent.futures import ThreadPoolExecutor
from collections import deque
from tqdm import tqdm

import facefusion.globals
from facefusion import logger, wording
from facefusion.content_analyser import analyse_stream
from facefusion.typing import VisionFrame, Face, Fps
from facefusion.face_analyser import get_average_face
from facefusion.processors.frame.core import get_frame_processors_modules, load_frame_processor_module
from facefusion.ffmpeg import open_ffmpeg
from facefusion.vision import normalize_frame_color, read_static_images, unpack_resolution
from facefusion.uis.typing import StreamMode, WebcamMode, ComponentName
from facefusion.uis.core import get_ui_component

WEBCAM_CAPTURE : Optional[cv2.VideoCapture] = None
WEBCAM_IMAGE : Optional[gradio.Image] = None
WEBCAM_START_BUTTON : Optional[gradio.Button] = None
WEBCAM_STOP_BUTTON : Optional[gradio.Button] = None
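
# Lazily open webcam device 0 and cache it in WEBCAM_CAPTURE so repeated
# start/stop cycles reuse a single device handle. On Windows the DirectShow
# backend (CAP_DSHOW) is requested explicitly; the default backend there is
# often slow to open capture devices, so DirectShow is a common workaround.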
def get_webcam_capture() -> Optional[cv2.VideoCapture]:
	global WEBCAM_CAPTURE

	if WEBCAM_CAPTURE is None:
		if platform.system().lower() == 'windows':
			webcam_capture = cv2.VideoCapture(0, cv2.CAP_DSHOW)
		else:
			webcam_capture = cv2.VideoCapture(0)
		if webcam_capture and webcam_capture.isOpened():
			WEBCAM_CAPTURE = webcam_capture
	return WEBCAM_CAPTURE
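
# Release the cached capture and reset the singleton so the next call to
# get_webcam_capture() reopens the device.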
def clear_webcam_capture() -> None:
	global WEBCAM_CAPTURE

	if WEBCAM_CAPTURE:
		WEBCAM_CAPTURE.release()
	WEBCAM_CAPTURE = None
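
# Build the Gradio widgets for this component: the preview image plus the
# start and stop buttons. They are stored in module globals so listen()
# can attach events to them afterwards.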
def render() -> None:
	global WEBCAM_IMAGE
	global WEBCAM_START_BUTTON
	global WEBCAM_STOP_BUTTON

	WEBCAM_IMAGE = gradio.Image(
		label = wording.get('uis.webcam_image')
	)
	WEBCAM_START_BUTTON = gradio.Button(
		value = wording.get('uis.start_button'),
		variant = 'primary',
		size = 'sm'
	)
	WEBCAM_STOP_BUTTON = gradio.Button(
		value = wording.get('uis.stop_button'),
		size = 'sm'
	)
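
# Wire up events: the start button launches the streaming generator, the
# stop button cancels it, and a change to any processor related component
# also cancels the running stream and triggers update().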
def listen() -> None:
	start_event = None
	webcam_mode_radio = get_ui_component('webcam_mode_radio')
	webcam_resolution_dropdown = get_ui_component('webcam_resolution_dropdown')
	webcam_fps_slider = get_ui_component('webcam_fps_slider')
	if webcam_mode_radio and webcam_resolution_dropdown and webcam_fps_slider:
		start_event = WEBCAM_START_BUTTON.click(start, inputs = [ webcam_mode_radio, webcam_resolution_dropdown, webcam_fps_slider ], outputs = WEBCAM_IMAGE)
		WEBCAM_STOP_BUTTON.click(stop, cancels = start_event)
	change_two_component_names : List[ComponentName] =\
	[
		'frame_processors_checkbox_group',
		'face_swapper_model_dropdown',
		'face_enhancer_model_dropdown',
		'frame_enhancer_model_dropdown',
		'lip_syncer_model_dropdown',
		'source_image'
	]
	for component_name in change_two_component_names:
		component = get_ui_component(component_name)
		if component:
			component.change(update, cancels = start_event)
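
# Generator behind the start button: build one average source face from all
# source images, configure the capture device, then yield processed frames.
# In 'inline' mode frames are shown in the Gradio image directly; in 'udp'
# and 'v4l2' modes raw BGR frames are piped into an ffmpeg process instead.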
def start(webcam_mode : WebcamMode, webcam_resolution : str, webcam_fps : Fps) -> Generator[VisionFrame, None, None]:
	facefusion.globals.face_selector_mode = 'one'
	facefusion.globals.face_analyser_order = 'large-small'
	source_frames = read_static_images(facefusion.globals.source_paths)
	source_face = get_average_face(source_frames)
	stream = None
	if webcam_mode in [ 'udp', 'v4l2' ]:
		stream = open_stream(webcam_mode, webcam_resolution, webcam_fps) # type: ignore[arg-type]
	webcam_width, webcam_height = unpack_resolution(webcam_resolution)
	webcam_capture = get_webcam_capture()

	if webcam_capture and webcam_capture.isOpened():
		webcam_capture.set(cv2.CAP_PROP_FOURCC, cv2.VideoWriter_fourcc(*'MJPG')) # type: ignore[attr-defined]
		webcam_capture.set(cv2.CAP_PROP_FRAME_WIDTH, webcam_width)
		webcam_capture.set(cv2.CAP_PROP_FRAME_HEIGHT, webcam_height)
		webcam_capture.set(cv2.CAP_PROP_FPS, webcam_fps)
		for capture_frame in multi_process_capture(source_face, webcam_capture, webcam_fps):
			if webcam_mode == 'inline':
				yield normalize_frame_color(capture_frame)
			else:
				try:
					stream.stdin.write(capture_frame.tobytes())
				except Exception:
					clear_webcam_capture()
				yield None
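
# Capture loop: frames are read from the device and processed on a thread
# pool so multiple frames are in flight at once; finished frames are queued
# and yielded in completion order. analyse_stream() acts as the content
# check; a positive result aborts the stream.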
def multi_process_capture(source_face : Face, webcam_capture : cv2.VideoCapture, webcam_fps : Fps) -> Generator[VisionFrame, None, None]:
	with tqdm(desc = wording.get('processing'), unit = 'frame', ascii = ' =', disable = facefusion.globals.log_level in [ 'warn', 'error' ]) as progress:
		with ThreadPoolExecutor(max_workers = facefusion.globals.execution_thread_count) as executor:
			futures = []
			deque_capture_frames : Deque[VisionFrame] = deque()

			while webcam_capture and webcam_capture.isOpened():
				_, capture_frame = webcam_capture.read()
				if analyse_stream(capture_frame, webcam_fps):
					return
				future = executor.submit(process_stream_frame, source_face, capture_frame)
				futures.append(future)
				for future_done in [ future for future in futures if future.done() ]:
					capture_frame = future_done.result()
					deque_capture_frames.append(capture_frame)
					futures.remove(future_done)
				while deque_capture_frames:
					progress.update()
					yield deque_capture_frames.popleft()
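
# Invoked when a processor related component changes: block until every
# selected frame processor passes post_check() again (models present and
# loadable), keeping the logger silent while polling.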
def update() -> None:
	for frame_processor in facefusion.globals.frame_processors:
		frame_processor_module = load_frame_processor_module(frame_processor)
		while not frame_processor_module.post_check():
			logger.disable()
			sleep(0.5)
		logger.enable()
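
# Stop button handler: release the webcam and clear the preview image.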
def stop() -> gradio.Image:
	clear_webcam_capture()
	return gradio.Image(value = None)
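
# Run a single frame through every enabled frame processor. Each module is
# gated by pre_process('stream'); the dict payload follows the dict based
# frame processor inputs, with no reference faces or audio in webcam mode.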
def process_stream_frame(source_face : Face, target_vision_frame : VisionFrame) -> VisionFrame:
	for frame_processor_module in get_frame_processors_modules(facefusion.globals.frame_processors):
		logger.disable()
		if frame_processor_module.pre_process('stream'):
			logger.enable()
			target_vision_frame = frame_processor_module.process_frame(
			{
				'source_face': source_face,
				'reference_faces': None,
				'source_audio_frame': None,
				'target_vision_frame': target_vision_frame
			})
	return target_vision_frame
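
# Spawn an ffmpeg process that reads raw BGR frames from stdin and either
# broadcasts an MPEG-TS stream over UDP or feeds the first virtual
# video4linux device found under /sys/devices/virtual/video4linux.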
def open_stream(stream_mode : StreamMode, stream_resolution : str, stream_fps : Fps) -> subprocess.Popen[bytes]:
	commands = [ '-f', 'rawvideo', '-pix_fmt', 'bgr24', '-s', stream_resolution, '-r', str(stream_fps), '-i', '-' ]

	if stream_mode == 'udp':
		commands.extend([ '-b:v', '2000k', '-f', 'mpegts', 'udp://localhost:27000?pkt_size=1316' ])
	if stream_mode == 'v4l2':
		try:
			device_name = os.listdir('/sys/devices/virtual/video4linux')[0]
			if device_name:
				commands.extend([ '-f', 'v4l2', '/dev/' + device_name ])
		except FileNotFoundError:
			logger.error(wording.get('stream_not_loaded').format(stream_mode = stream_mode), __name__.upper())
	return open_ffmpeg(commands)
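
# Example: with stream_mode 'udp', resolution '640x480' and 25 fps, the
# resulting invocation is roughly the following (open_ffmpeg() is assumed
# to prepend the ffmpeg binary and its log level flags):
#
#   ffmpeg -f rawvideo -pix_fmt bgr24 -s 640x480 -r 25 -i - \
#          -b:v 2000k -f mpegts udp://localhost:27000?pkt_size=1316
#
# start() then writes each processed frame to the process stdin as raw bytes.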