[CVPR 2024] AnyDoor: Zero-shot Object-level Image Customization
github.com/ali-vilab/AnyDoor.
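Preface:
[Quick paper read] These notes distill the paper's core points using the "Ten Questions About a Paper" template, so that researchers in the field can grasp the content quickly. The paper's text is deliberately left untranslated: the author believes translation inevitably distorts the original meaning, and encourages researchers new to the field to read the original paper.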
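Contents
- 01 What are the shortcomings of existing work?
- 02 What problem does the paper solve?
- 03 What is the key solution?
- 04 What are the main contributions?
- 05 What is the related work?
- 06 How is the method implemented?
- 07 How are the experiments designed?
- 08 What are the results and how do they compare?
- 09 What do the ablation studies tell us?
- 10 How could this work be improved?
- References
01 What are the shortcomings of existing work?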
Local image editing: these methods give only coarse guidance for generation and often fail to synthesize ID-consistent results for new concepts unseen in training.
Customized image generation: although these methods can generate high-fidelity images, the user cannot specify the scenario or the location of the target object. Besides, time-consuming finetuning keeps them from being used in large-scale applications.
Image harmonization: these methods explore only low-level changes; editing the structure, view, and pose of the foreground object, or generating shadows and reflections, is not taken into consideration.
02 What problem does the paper solve?
This work presents AnyDoor, a diffusion-based image generator with the power to teleport target objects to new scenes at user-specified locations with desired shapes.
AnyDoor generates high-quality, ID-consistent compositions in a zero-shot manner.
03 What is the key solution?
Instead of tuning parameters for each object, our model is trained only once and effortlessly generalizes to diverse object-scene combinations at the inference stage. Such a challenging zero-shot setting requires an adequate characterization of a certain object.
- We complement the commonly used identity feature with detail features, which are carefully designed to maintain appearance details yet allow versatile local variations (e.g., lighting, orientation, posture, etc.), supporting the object in favorably blending with different surroundings.
- We further propose to borrow knowledge from video datasets, where we can observe various forms (i.e., along the time axis) of a single object, leading to stronger model generalizability and robustness.
04 What are the main contributions?
- We present AnyDoor for object teleportation. The core idea is to use a discriminative ID extractor and a frequency-aware detail extractor to characterize the target object.
- Trained on a large combination of video and image data, we composite the object at the specific location of the scene image with effective shape control.
- AnyDoor provides a universal solution for general region-to-region mapping tasks and could be profitable for various applications.
05 What is the related work?
- Stable Diffusion [41]
- IP-Adapter [58]
- Paint-by-Example [56]
- Graphit [16]
- DreamBooth [42]
- Custom Diffusion [27]
- Cones [33]
06 How is the method implemented?
In this paper, we investigate “object teleportation”, which means accurately and seamlessly placing the target object into the desired location of the scene image.
We re-generate a box/mask-marked local region of a scene image by taking the target object as the template.
- We represent the target object with identity and detail-related features,
- then composite them with the interaction of the background scene.
- We use an ID extractor to produce discriminative ID tokens and delicately design a frequency-aware detail extractor to get detail maps as a supplement.
- We inject the ID tokens and the detail maps into a pre-trained text-to-image diffusion model as guidance to generate the desired composition.
- To make the generated content more customizable, we explore leveraging additional controls (e.g., user-drawn masks) to indicate the shape and pose of the object.
- To learn customized object generation with high diversity, we collect image pairs for the same object from videos to learn the appearance variations, and also leverage large-scale static images to guarantee the scenario diversity.
High-frequency map
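The paper's formula for this map is not reproduced in these notes. As a rough sketch, assuming the map is built from horizontal and vertical Sobel responses (the Sobel operator is reference [23]) restricted to the object region, it could look like the following. The names `filter3x3` and `high_frequency_map`, the plain-NumPy filtering, and the way the two responses are combined are all assumptions for illustration, not the authors' implementation.

```python
import numpy as np

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float32)
SOBEL_Y = SOBEL_X.T


def filter3x3(image, kernel):
    """Apply a 3x3 filter (cross-correlation) with zero padding; output keeps the input shape."""
    padded = np.pad(image, 1, mode="constant")
    out = np.zeros_like(image, dtype=np.float32)
    h, w = image.shape
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out


def high_frequency_map(gray, mask):
    """Hypothetical helper: edge responses of a grayscale image, kept only inside `mask`."""
    gray = gray.astype(np.float32)
    edges = np.abs(filter3x3(gray, SOBEL_X)) + np.abs(filter3x3(gray, SOBEL_Y))
    return edges * mask.astype(np.float32)
```

Flat regions give a zero response, so only edges and textures of the masked object survive, which matches the idea of a detail map that supplements the ID tokens.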
Training is supervised with a mean squared error loss.
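The formula itself is not reproduced in these notes. For orientation, a mean squared error objective for a conditional diffusion model is commonly written in the form below. The symbols are assumptions chosen here: $x$ is the ground-truth image, $c$ the conditioning (in this setting, the ID tokens and detail maps), $\epsilon \sim \mathcal{N}(0, 1)$ Gaussian noise, $\alpha_t$ and $\sigma_t$ the noise-schedule coefficients at step $t$, and $D_\theta$ the denoising network. The paper gives the authors' exact expression.

$$
\mathcal{L} = \mathbb{E}_{x, c, \epsilon, t}\left[ \left\lVert x - D_\theta(\alpha_t x + \sigma_t \epsilon,\; c) \right\rVert_2^2 \right]
$$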
07 How are the experiments designed?
During inference, given a scene image and a location box, we expand the box into a square with an amplifier ratio of 2.0.
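The box expansion can be sketched as follows. This is a minimal illustration, assuming the square is centered on the original box, its side is the longer box edge times the ratio, and the result is clipped to the image bounds. The function name `expand_box_to_square` and the clipping behavior are assumptions, not taken from the AnyDoor code.

```python
def expand_box_to_square(box, image_size, ratio=2.0):
    """Hypothetical helper: grow (x1, y1, x2, y2) into a centered square, clipped to the image."""
    x1, y1, x2, y2 = box
    width, height = image_size
    # Center of the original box.
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    # Side of the square: the longer edge scaled by the amplifier ratio.
    half = max(x2 - x1, y2 - y1) * ratio / 2.0
    left = max(0, int(round(cx - half)))
    top = max(0, int(round(cy - half)))
    right = min(width, int(round(cx + half)))
    bottom = min(height, int(round(cy + half)))
    return left, top, right, bottom
```

For a 100×50 box at (100, 100, 200, 150) in a 512×512 image, this yields the 200×200 region (50, 25, 250, 225). Near the image border the clipped region is no longer a perfect square; padding the image instead would be an alternative design.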
For quantitative results, we construct a new benchmark with 30 new concepts provided by DreamBooth [42] as the target images. For the scene images, we manually pick 80 images with boxes from COCO-Val [31]. This gives 30 × 80 = 2,400 generated images for the object-scene combinations. We also conduct a qualitative analysis on the VITON-HD test set [13] to validate the performance for virtual try-on.
We follow DreamBooth [42] in calculating the CLIP-Score and DINO-Score, as these metrics reflect the similarity between the generated region and the target object. We also organize a user study with a group of 15 annotators, who rate the generated results in terms of fidelity, quality, and diversity.
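Both scores boil down to a similarity between image embeddings. The sketch below assumes the embeddings of the generated regions and of the target objects have already been extracted (by a CLIP or DINO image encoder, which is not loaded here) and that the score is the mean cosine similarity over pairs. The function names are hypothetical.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity between two 1-D embedding vectors."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def mean_similarity_score(generated_embeddings, target_embeddings):
    """Hypothetical helper: average cosine similarity over (generated, target) pairs."""
    scores = [
        cosine_similarity(g, t)
        for g, t in zip(generated_embeddings, target_embeddings)
    ]
    return sum(scores) / len(scores)
```

Swapping the encoder that produces the embeddings is what would turn the same computation into a CLIP-based or a DINO-based score.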
08 What are the results and how do they compare?
Extensive experiments demonstrate the superiority of our approach over existing alternatives as well as its great potential in real-world applications, such as virtual try-on, shape editing, and object swapping.
Comparisons with reference-based methods.
Figure 5 shows that previous reference-based methods only keep semantic consistency for distinguishing features, like the dog face on the backpack, and for coarse-grained patterns, like the color of the sloth toy. However, as those new concepts are not included in the training categories, their generation results are far from ID-consistent. In contrast, AnyDoor shows promising performance for zero-shot image customization with highly faithful details.
Comparisons with Tuning-based methods.
User study.
09 What do the ablation studies tell us?
Core components.
ID extractor.
Detail extractor.
More Applications
10 How could this work be improved?
AnyDoor still struggles with fine details such as small characters or logos. This issue might be solved by collecting related training data, enlarging the resolution, and training better VAE decoders.
References
[1] Ali Athar, Jonathon Luiten, Paul Voigtlaender, Tarasha Khurana, Achal Dave, Bastian Leibe, and Deva Ramanan. Burst: A benchmark for unifying object recognition, segmentation and tracking in video. In WACV, 2023.
[2] Omri Avrahami, Dani Lischinski, and Ohad Fried. Blended diffusion for text-driven editing of natural images. In CVPR, 2022.
[3] Omri Avrahami, Kfir Aberman, Ohad Fried, Daniel Cohen-Or, and Dani Lischinski. Break-a-scene: Extracting multiple concepts from a single image. In SIGGRAPH Asia, 2023.
[4] Ali Borji, Ming-Ming Cheng, Huaizu Jiang, and Jia Li. Salient object detection: A benchmark. TIP, 2015.
[5] Mingdeng Cao, Xintao Wang, Zhongang Qi, Ying Shan, Xiaohu Qie, and Yinqiang Zheng. Masactrl: Tuning-free mutual self-attention control for consistent image synthesis and editing. In ICCV, 2023.
[6] Arantxa Casanova, Marlène Careil, Adriana Romero-Soriano, Christopher J Pal, Jakob Verbeek, and Michal Drozdzal. Controllable image generation via collage representations. arXiv:2304.13722, 2023.
[7] Bor-Chun Chen and Andrew Kae. Toward realistic image compositing with adversarial learning. In CVPR, 2019.
[8] Haoxing Chen, Zhangxuan Gu, Yaohui Li, Jun Lan, Changhua Meng, Weiqiang Wang, and Huaxiong Li. Hierarchical dynamic image harmonization. In ACMMM, 2022.
[9] Hong Chen, Yipeng Zhang, Xin Wang, Xuguang Duan, Yuwei Zhou, and Wenwu Zhu. Disenbooth: Disentangled parameter-efficient tuning for subject-driven text-to-image generation. arXiv:2305.03374, 2023.
[10] Wenhu Chen, Hexiang Hu, Yandong Li, Nataniel Ruiz, Xuhui Jia, Ming-Wei Chang, and William W Cohen. Subject-driven text-to-image generation via apprenticeship learning. In NeurIPS, 2023.
[11] Xi Chen, Zhiyan Zhao, Feiwu Yu, Yilei Zhang, and Manni Duan. Conditional diffusion for interactive segmentation. In ICCV, 2021.
[12] Xi Chen, Zhiyan Zhao, Yilei Zhang, Manni Duan, Donglian Qi, and Hengshuang Zhao. Focalclick: towards practical interactive image segmentation. In CVPR, 2022.
[13] Seunghwan Choi, Sunghyun Park, Minsoo Lee, and Jaegul Choo. Viton-hd: High-resolution virtual try-on via misalignment-aware normalization. In CVPR, 2021.
[14] Wenyan Cong, Jianfu Zhang, Li Niu, Liu Liu, Zhixin Ling, Weiyuan Li, and Liqing Zhang. Dovenet: Deep image harmonization via domain verification. In CVPR, 2020.
[15] Wenyan Cong, Xinhao Tao, Li Niu, Jing Liang, Xuesong Gao, Qihao Sun, and Liqing Zhang. High-resolution image harmonization via collaborative dual transformations. In CVPR, 2022.
[16] Graphit Contributors. Graphit: A unified framework for diverse image editing tasks. https://github.com/navervision/Graphit, 2023.
[17] Henghui Ding, Chang Liu, Shuting He, Xudong Jiang, Philip HS Torr, and Song Bai. Mose: A new dataset for video object segmentation in complex scenes. In ICCV, 2023.
[18] Rinon Gal, Yuval Alaluf, Yuval Atzmon, Or Patashnik, Amit H Bermano, Gal Chechik, and Daniel Cohen-Or. An image is worth one word: Personalizing text-to-image generation using textual inversion. In ICLR, 2023.
[19] Yuchao Gu, Xintao Wang, Jay Zhangjie Wu, Yujun Shi, Yunpeng Chen, Zihan Fan, Wuyou Xiao, Rui Zhao, Shuning Chang, Weijia Wu, et al. Mix-of-show: Decentralized low-rank adaptation for multi-concept customization of diffusion models. In NeurIPS, 2023.
[20] Zonghui Guo, Haiyong Zheng, Yufeng Jiang, Zhaorui Gu, and Bing Zheng. Intrinsic image harmonization. In CVPR, 2021.
[21] Agrim Gupta, Piotr Dollar, and Ross Girshick. Lvis: A dataset for large vocabulary instance segmentation. In CVPR, 2019.
[22] Lianghua Huang, Di Chen, Yu Liu, Yujun Shen, Deli Zhao, and Jingren Zhou. Composer: Creative and controllable image synthesis with composable conditions. In ICML, 2023.
[23] Nick Kanopoulos, Nagesh Vasanthavada, and Robert L Baker. Design of an image edge detection filter using the sobel operator. JSSC, 1988.
[24] Bahjat Kawar, Shiran Zada, Oran Lang, Omer Tov, Huiwen Chang, Tali Dekel, Inbar Mosseri, and Michal Irani. Imagic: Text-based real image editing with diffusion models. In CVPR, 2023.
[25] Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv:1412.6980, 2014.
[26] Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alexander C Berg, Wan-Yen Lo, et al. Segment anything. In ICCV, 2023.
[27] Nupur Kumari, Bingliang Zhang, Richard Zhang, Eli Shechtman, and Jun-Yan Zhu. Multi-concept customization of text-to-image diffusion. In CVPR, 2023.
[28] Dongxu Li, Junnan Li, and Steven CH Hoi. Blip-diffusion: Pre-trained subject representation for controllable text-to-image generation and editing. In NeurIPS, 2023.
[29] Junnan Li, Dongxu Li, Silvio Savarese, and Steven Hoi. Blip-2: Bootstrapping language-image pre-training with frozen image encoders and large language models. In ICML, 2023.
[30] Tianle Li, Max Ku, Cong Wei, and Wenhu Chen. Dreamedit: Subject-driven image editing. arXiv:2306.12624, 2023.
[31] Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C Lawrence Zitnick. Microsoft coco: Common objects in context. In ECCV, 2014.
[32] Qin Liu, Zhenlin Xu, Gedas Bertasius, and Marc Niethammer. Simpleclick: Interactive image segmentation with simple vision transformers. In ICCV, 2023.
[33] Zhiheng Liu, Ruili Feng, Kai Zhu, Yifei Zhang, Kecheng Zheng, Yu Liu, Deli Zhao, Jingren Zhou, and Yang Cao. Cones: Concept neurons in diffusion models for customized generation. In ICML, 2023.
[34] Zhiheng Liu, Yifei Zhang, Yujun Shen, Kecheng Zheng, Kai Zhu, Ruili Feng, Yu Liu, Deli Zhao, Jingren Zhou, and Yang Cao. Cones 2: Customizable image synthesis with multiple subjects. In NeurIPS, 2023.
[35] Jiaxu Miao, Xiaohan Wang, Yu Wu, Wei Li, Xu Zhang, Yunchao Wei, and Yi Yang. Large-scale video panoptic segmentation in the wild: A benchmark. In CVPR, 2022.
[36] Maxime Oquab, Timothée Darcet, Théo Moutakanni, Huy Vo, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel Haziza, Francisco Massa, Alaaeldin El-Nouby, et al. Dinov2: Learning robust visual features without supervision. TMLR, 2024.
[37] Can Qin, Shu Zhang, Ning Yu, Yihao Feng, Xinyi Yang, Yingbo Zhou, Huan Wang, Juan Carlos Niebles, Caiming Xiong, Silvio Savarese, et al. Unicontrol: A unified diffusion model for controllable visual generation in the wild. In NeurIPS, 2023.
[38] Xuebin Qin, Zichen Zhang, Chenyang Huang, Masood Dehghan, Osmar R Zaiane, and Martin Jagersand. U2net: Going deeper with nested u-structure for salient object detection. PR, 2020.
[39] Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. Learning transferable visual models from natural language supervision. In ICML, 2021.
[40] Aditya Ramesh, Prafulla Dhariwal, Alex Nichol, Casey Chu, and Mark Chen. Hierarchical text-conditional image generation with clip latents. arXiv:2204.06125, 2022.
[41] Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. High-resolution image synthesis with latent diffusion models. In CVPR, 2022.
[42] Nataniel Ruiz, Yuanzhen Li, Varun Jampani, Yael Pritch, Michael Rubinstein, and Kfir Aberman. Dreambooth: Fine tuning text-to-image diffusion models for subject-driven generation. In CVPR, 2023.
[43] Chitwan Saharia, William Chan, Saurabh Saxena, Lala Li, Jay Whang, Emily L Denton, Kamyar Ghasemipour, Raphael Gontijo Lopes, Burcu Karagol Ayan, Tim Salimans, et al. Photorealistic text-to-image diffusion models with deep language understanding. In NeurIPS, 2022.
[44] Vishnu Sarukkai, Linden Li, Arden Ma, Christopher Ré, and Kayvon Fatahalian. Collage diffusion. In WACV, 2024.
[45] Jing Shi, Wei Xiong, Zhe Lin, and Hyun Joon Jung. Instantbooth: Personalized text-to-image generation without test-time finetuning. arXiv:2304.03411, 2023.
[46] Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556, 2014.
[47] Yizhi Song, Zhifei Zhang, Zhe Lin, Scott Cohen, Brian Price, Jianming Zhang, Soo Ye Kim, and Daniel Aliaga. Objectstitch: Object compositing with diffusion model. In CVPR, 2023.
[48] Kalyan Sunkavalli, Micah K Johnson, Wojciech Matusik, and Hanspeter Pfister. Multi-scale image harmonization. In SIGGRAPH, 2010.
[49] Lijun Wang, Huchuan Lu, Yifan Wang, Mengyang Feng, Dong Wang, Baocai Yin, and Xiang Ruan. Learning to detect salient objects with image-level supervision. In CVPR, 2017.
[50] Tan Wang, Linjie Li, Kevin Lin, Chung-Ching Lin, Zhengyuan Yang, Hanwang Zhang, Zicheng Liu, and Lijuan Wang. Disco: Disentangled control for referring human dance generation in real world. arXiv:2307.00040, 2023.
[51] Weiyao Wang, Matt Feiszli, Heng Wang, and Du Tran. Unidentified video objects: A benchmark for dense, open-world segmentation. In ICCV, 2021.
[52] Guangxuan Xiao, Tianwei Yin, William T Freeman, Frédo Durand, and Song Han. Fastcomposer: Tuning-free multi-subject image generation with localized attention. arXiv:2305.10431, 2023.
[53] Shaoan Xie, Zhifei Zhang, Zhe Lin, Tobias Hinz, and Kun Zhang. Smartbrush: Text and shape guided object inpainting with diffusion model. In CVPR, 2023.
[54] Ning Xu, Linjie Yang, Yuchen Fan, Dingcheng Yue, Yuchen Liang, Jianchao Yang, and Thomas Huang. Youtube-vos: A large-scale video object segmentation benchmark. arXiv:1809.03327, 2018.
[55] Ben Xue, Shenghui Ran, Quan Chen, Rongfei Jia, Binqiang Zhao, and Xing Tang. Dccf: Deep comprehensible color filter learning framework for high-resolution image harmonization. In ECCV, 2022.
[56] Binxin Yang, Shuyang Gu, Bo Zhang, Ting Zhang, Xuejin Chen, Xiaoyan Sun, Dong Chen, and Fang Wen. Paint by example: Exemplar-based image editing with diffusion models. In CVPR, 2023.
[57] Linjie Yang, Yuchen Fan, and Ning Xu. Video instance segmentation. In ICCV, 2019.
[58] Hu Ye, Jun Zhang, Sibo Liu, Xiao Han, and Wei Yang. Ip-adapter: Text compatible image prompt adapter for text-to-image diffusion models. arXiv:2308.06721, 2023.
[59] Tao Yu, Runseng Feng, Ruoyu Feng, Jinming Liu, Xin Jin, Wenjun Zeng, and Zhibo Chen. Inpaint anything: Segment anything meets image inpainting. arXiv:2304.06790, 2023.
[60] Xianggang Yu, Mutian Xu, Yidan Zhang, Haolin Liu, Chongjie Ye, Yushuang Wu, Zizheng Yan, Chenming Zhu, Zhangyang Xiong, Tianyou Liang, et al. Mvimgnet: A large-scale dataset of multi-view images. In CVPR, 2023.
[61] Ziyang Yuan, Mingdeng Cao, Xintao Wang, Zhongang Qi, Chun Yuan, and Ying Shan. Customnet: Zero-shot object customization with variable-viewpoints in text-to-image diffusion models. arXiv:2310.19784, 2023.
[62] Lvmin Zhang and Maneesh Agrawala. Adding conditional control to text-to-image diffusion models. In ICCV, 2023.
[63] Zhixing Zhang, Ligong Han, Arnab Ghosh, Dimitris N Metaxas, and Jian Ren. Sine: Single image editing with text-to-image diffusion models. In CVPR, 2023.
[64] Na Zheng, Xuemeng Song, Zhaozheng Chen, Linmei Hu, Da Cao, and Liqiang Nie. Virtually trying on new clothing with arbitrary poses. In ACMMM, 2019.