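VATIC - Video Annotation Tool from Irvine, California
VATIC is an online video annotation tool for computer vision research that crowdsources work to Amazon's Mechanical Turk. Our tool makes it easy to build massive, affordable video data sets. VATIC is suitable for building datasets for image detection tasks, and it can also annotate video: for a 25 fps video, for example, you only need to mark an object's position by hand roughly every 100 frames to get reasonably good results across the whole video. This relies on the OpenCV tracking algorithms integrated into the software. It can be used with crowdsourcing marketplaces to scale up video annotation work efficiently.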

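Installation

Note: VATIC has only been tested on Ubuntu with the Apache 2.2 HTTP server and a MySQL server. This document describes installation on that platform, but it should work on any operating system and with any server.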

You can download and extract VATIC from our website. Note: do not run the installer as root.

$ wget http://mit.edu/vondrick/vatic/vatic-install.sh
$ chmod +x vatic-install.sh
$ ./vatic-install.sh
$ cd vatic

HTTP Server Configuration

Open the Apache configuration file. On Ubuntu, this file is located at:

/etc/apache2/sites-enabled/000-default

If you do not use Apache on this computer for any other purpose, replace the contents of the file with:

WSGIDaemonProcess www-data
WSGIProcessGroup www-data

<VirtualHost *:80>
    ServerName vatic.domain.edu
    DocumentRoot /path/to/vatic/public

    WSGIScriptAlias /server /path/to/vatic/server.py
    CustomLog /var/log/apache2/access.log combined
</VirtualHost>

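Update ServerName with your domain name, DocumentRoot with the path to the public directory in VATIC, and WSGIScriptAlias with the path to VATIC's server.py file.

If you do use Apache for other purposes, you will have to set up a new virtual host with the correct document root and script alias, as shown above.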

Make sure you have the mod_headers module enabled:

$ sudo cp /etc/apache2/mods-available/headers.load /etc/apache2/mods-enabled

After making these changes, restart Apache:

$ sudo apache2ctl graceful

SQL Server Configuration

We recommend creating a separate database specifically for VATIC:

$ mysql -u root
mysql> create database vatic;

The next section will automatically create the necessary tables.

Setup

Inside the vatic directory, copy config.py-example to config.py:

$ cp config.py-example config.py

Then open config.py and change the following variables to configure VATIC:

signature       Amazon Mechanical Turk AWS signature (secret access key)
accesskey       Amazon Mechanical Turk AWS access key (access key ID)
sandbox         If true, put into Mturk sandbox mode. For debugging.
localhost       The local HTTP address: http://vatic.domain.edu/ so it
                matches the ServerName in Apache.
database        Database connection string: for example,
                mysql://user:pass@localhost/vatic
geolocation     API key from ipinfodb.com for geolocation services

If you do not plan on using VATIC on Mechanical Turk (offline mode only), you can leave signature and accesskey empty.
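As an illustration, a filled-in config.py for offline-only use might look like the sketch below. The variable names come from the table above; every value shown (host name, database user and password, the empty geolocation key) is a placeholder assumption, not a default shipped with VATIC.

```python
# config.py: illustrative values only; replace each one with your own.

signature = ""    # AWS secret access key (left empty: offline mode only)
accesskey = ""    # AWS access key ID (left empty: offline mode only)
sandbox = True    # True puts VATIC into the MTurk sandbox, for debugging

# Local HTTP address; it should match the ServerName set in Apache.
localhost = "http://vatic.domain.edu/"

# Database connection string (user and pass are placeholders).
database = "mysql://user:pass@localhost/vatic"

# API key from ipinfodb.com; left empty here as an assumption.
geolocation = ""
```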

After saving the file, you can initialize the database:

$ turkic setup --database

Note: if you want to reset the database, you can do so with:

$ turkic setup --database --reset

This will ask for confirmation before resetting, in order to prevent data loss.

Finally, you must also allow VATIC to access turkic, a major dependency:

$ turkic setup --public-symlink

Annotation

Before you continue, you should verify that the installation is correct. You can do so with:

$ turkic status --verify

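If you receive any error messages, the installation is not complete and you should review the previous section. Note: if you do not plan on using Mechanical Turk, you can safely ignore any errors caused by Mechanical Turk.

Frame Extraction

Our system requires that videos be extracted into JPEG frames. Our tool can do this automatically for you: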

$ mkdir /path/to/output/directory
$ turkic extract /path/to/video.mp4 /path/to/output/directory

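By default, our tool resizes the frames to fit within a 720x480 rectangle. We believe this resolution is ideal for online video viewing. You can change the resolution with options: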

$ turkic extract /path/to/video.mp4 /path/to/output/directory \
  --width 1000 --height 1000

or

$ turkic extract /path/to/video.mp4 /path/to/output/directory --no-resize

The tool will maintain the aspect ratio in all cases.
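To make the "fit within a rectangle while keeping the aspect ratio" rule concrete, here is a minimal Python sketch of the arithmetic. The function name fit_within is hypothetical, and the rounding and the fact that small frames are scaled up are assumptions of this sketch; VATIC's own resizing code may behave differently.

```python
def fit_within(width, height, max_width=720, max_height=480):
    """Scale (width, height) to fit inside max_width x max_height.

    Both sides are multiplied by the same factor, so the aspect ratio is
    preserved. Assumption: results are rounded to the nearest pixel.
    """
    if width <= 0 or height <= 0:
        raise ValueError("frame dimensions must be positive")
    # The smaller of the two ratios is the largest scale that still fits.
    scale = min(max_width / width, max_height / height)
    return int(round(width * scale)), int(round(height * scale))
```

For example, a 1920x1080 frame is limited by its width (scale 0.375) and becomes 720x405, while a square 1000x1000 frame is limited by its height and becomes 480x480.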

Alternatively, if you have already extracted frames, you can use the formatframes command to convert them into a layout that VATIC understands:

$ turkic formatframes /path/to/frames/ /path/to/output/directory

The above command will read all the images in /path/to/frames and create hard links (soft copies) in /path/to/output/directory.

Importing a Video

After extracting frames, the video can be imported into our tool for annotation. The general syntax for this operation is:

$ turkic load identifier /path/to/output/directory Label1 Label2 LabelN

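where identifier is a unique string that you will use to refer to this video, /path/to/output/directory is the directory of frames, and LabelX are the class labels that you want annotated (e.g., Person, Car, Bicycle). You can have as many class labels as you wish, but you must have at least one.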

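When a video is imported, it is broken into small segments, typically only a few seconds long. Once all the segments are annotated, the annotations are merged across segments, which is possible because each segment overlaps its neighbor by a small margin.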
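The splitting described above can be sketched in Python as follows. This is only an illustration of overlapping segmentation: the function name split_segments is hypothetical, the parameters mirror the --length and --overlap options, and the exact segment boundaries VATIC produces are an assumption here.

```python
def split_segments(total_frames, length, overlap):
    """Split frames 0..total_frames-1 into overlapping (start, stop) pairs.

    Both ends are inclusive. Consecutive segments share `overlap` frames,
    which is what allows annotations to be merged across segments later.
    """
    if length <= overlap:
        raise ValueError("length must be greater than overlap")
    if total_frames <= 0:
        return []
    segments = []
    last = total_frames - 1
    start = 0
    while True:
        stop = min(start + length - 1, last)
        segments.append((start, stop))
        if stop == last:
            break
        # Step forward, keeping `overlap` frames in common with this segment.
        start += length - overlap
    return segments
```

With 1000 frames, a length of 300 and an overlap of 20, this yields (0, 299), (280, 579), (560, 859) and (840, 999); frames 280 to 299 belong to both of the first two segments.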

The above command specifies all of the required options, but many other options are available as well. We recommend using them.

MTurk Options
    --title                 The title that MTurk workers see
    --description           The description that MTurk workers see
    --duration              Time in seconds that a worker has to complete the
                            task
    --lifetime              Time in seconds that the task is online
    --keywords              Keywords that MTurk workers can search on
    --offline               Disable MTurk and use for self annotation only

Compensation Options
    --cost                  The price advertised to MTurk workers
    --per-object-bonus      A bonus in dollars paid for each object
    --completion-bonus      A bonus in dollars paid for completing the task

Qualification Options
    --min-approved-percent  Minimum percent of tasks the worker must have
                            approved before they can work for you
    --min-approved-amount   Minimum number of tasks that the worker must have
                            completed before they can work for you

Video Options
    --length                The length of each segment for this video in frames
    --overlap               The overlap between segments in frames
    --use-frames            When splitting into segments, only use the frame
                            intervals specified in this file. Each line should
                            contain a start frame, followed by a space, then
                            the stop frame. Frames outside the intervals in
                            this file will be ignored.
    --skip                  If specified, request annotations only every N
                            frames.
    --blow-radius           When a user marks an annotation, blow away all
                            other annotations within this many frames. If you
                            want to allow the user to make fine-grained
                            annotations, set this number to a small integer,
                            or 0 to disable. By default, this is 5, which we
                            recommend.
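The file passed to --use-frames has a simple format: one interval per line, written as a start frame, a space, and a stop frame. A small Python helper for producing such a file might look like the sketch below. The helper names and the validation rules are assumptions for illustration and are not part of VATIC.

```python
def format_use_frames(intervals):
    """Return the text of a --use-frames file for (start, stop) pairs."""
    lines = []
    for start, stop in intervals:
        if start < 0 or stop < start:
            raise ValueError("invalid interval: %r" % ((start, stop),))
        # One interval per line: start frame, a space, then stop frame.
        lines.append("%d %d" % (start, stop))
    return "\n".join(lines) + "\n"


def write_use_frames(path, intervals):
    """Write the intervals to `path` in the format described above."""
    with open(path, "w") as handle:
        handle.write(format_use_frames(intervals))
```

Calling write_use_frames("intervals.txt", [(0, 99), (250, 400)]) writes two lines, "0 99" and "250 400"; the file name is arbitrary.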

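You can also specify temporal attributes that each object label can take on. For example, you may have a Person object with the attributes "walking", "running", or "sitting". You specify attributes the same way as labels, except that you prepend a ~ to the text, which binds the attribute to the preceding label: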

$ turkic load identifier /path/to/output/directory Label1 ~Attr1A ~Attr1B \
  Label2 ~Attr2A ~Attr2B ~Attr2C Label3

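In the above example, Label1 will have the attributes Attr1A and Attr1B, Label2 will have the attributes Attr2A, Attr2B, and Attr2C, and Label3 will have no attributes. Specifying attributes is optional.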
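The binding rule (each ~attribute attaches to the nearest preceding label) can be illustrated with a short Python sketch. The function parse_labels is hypothetical and is not how turkic parses its command line; it only demonstrates the grouping.

```python
def parse_labels(tokens):
    """Group a flat list of label and ~attribute tokens by label.

    Returns a dict mapping each label to the list of attributes that
    follow it, in the order given.
    """
    labels = {}
    current = None
    for token in tokens:
        if token.startswith("~"):
            if current is None:
                raise ValueError("attribute %r has no preceding label" % token)
            labels[current].append(token[1:])
        else:
            current = token
            labels.setdefault(current, [])
    return labels
```

Applied to the tokens from the example command, this maps Label1 to Attr1A and Attr1B, Label2 to Attr2A, Attr2B and Attr2C, and Label3 to an empty list.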

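Gold Standard Training

It turns out that video annotation is extremely challenging, and most MTurk workers lack the necessary patience. For this reason, we recommend requiring workers to pass a "gold standard" video. When a new worker visits the task, they are redirected to a video for which the annotations are already known. In order to move on to the real annotation work, the worker must first annotate the gold standard video correctly. We have found that this approach significantly improves the quality of the annotations.

To use this feature, import a video to be used as the gold standard: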

$ turkic load identifier-train /path/to/frames Label1 Label2 LabelN \
  --for-training --for-training-start 0 --for-training-stop 500 \
  --for-training-overlap 0.5 --for-training-tolerance 0.1 \
  --for-training-mistakes 1

You can also use any of the options described above. The new options are explained below:

--for-training              Specifies that this video is gold standard
--for-training-start        Specifies the first frame to use
--for-training-stop         Specifies the last frame to use
--for-training-overlap      Percent overlap that worker's boxes must match 
--for-training-tolerance    Percent that annotations must agree temporally
--for-training-mistakes     The number of completely wrong annotations 
                            allowed. We recommend setting this to a small,
                            nonzero integer.
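As a rough illustration of what a box overlap threshold such as --for-training-overlap 0.5 could mean, the Python sketch below scores two boxes by intersection over union. This document does not say which overlap measure VATIC actually uses, so intersection over union, the function names, and the comparison rule are all assumptions.

```python
def box_overlap(a, b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def passes_overlap(worker_box, gold_box, threshold=0.5):
    """True if the worker's box matches the gold box closely enough."""
    return box_overlap(worker_box, gold_box) >= threshold
```

Under this measure, a worker box shifted by half its width relative to the gold box scores one third and would fail a 0.5 threshold, while an identical box scores 1.0.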

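After you run the above command, it will give you a URL where you can enter the ground truth annotation. You must annotate this ground truth as carefully as possible, because it will be used to evaluate future workers.

You can now specify that a video should use the gold standard video: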

$ turkic load identifier /path/to/output/directory Label1 Label2 LabelN \
  --train-with identifier-train

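When a worker who has not been seen before visits this video, they will now be redirected to the training video and required to pass the evaluation test first.

Publishing Tasks

When you are ready for MTurk workers to annotate, you must publish the tasks, which allows workers to start annotating: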

$ turkic publish

You can limit the number of tasks that are published:

$ turkic publish --limit 100

Running the above command repeatedly will launch tasks in batches of 100. You can also disable all pending tasks:

$ turkic publish --disable

This will "unpublish" tasks that have not yet been completed.

If your videos are offline only, you can see their access URLs with the command:

$ turkic publish --offline

Note: for the above command to work, you must have loaded the video with the --offline parameter:

$ turkic load identifier /path/to/frames Person --offline

Checking the Status

You can check the status of the video annotation server with the command:

$ turkic status

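This will list various statistics about the server, such as the number of jobs published and the number of jobs completed. You can get even more statistics by requesting additional information from Amazon: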

$ turkic status --turk

This will output how much money is left in your account, among other statistics.

When all the videos are annotated, the last line will read:

Server is offline.

Retrieving Annotations

You can get all the annotations for a video with the command:

$ turkic dump identifier -o output.txt

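This will write a file "output.txt" in which each line contains one annotation. Each line contains 10 space-separated columns, optionally followed by attribute columns. The columns are defined as follows: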

1   Track ID. All rows with the same ID belong to the same path.
2   xmin. The top left x-coordinate of the bounding box.
3   ymin. The top left y-coordinate of the bounding box.
4   xmax. The bottom right x-coordinate of the bounding box.
5   ymax. The bottom right y-coordinate of the bounding box.
6   frame. The frame that this annotation represents.
7   lost. If 1, the annotation is outside of the view screen.
8   occluded. If 1, the annotation is occluded.
9   generated. If 1, the annotation was automatically interpolated.
10  label. The label for this annotation, enclosed in quotation marks.
11+ attributes. Each column after this is an attribute.
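For downstream processing, the dump file can be read with a few lines of Python. The sketch below follows the column definitions above; the Annotation type and function names are hypothetical, and it assumes that the first nine columns are integers and that attribute columns are quoted the same way as the label.

```python
import shlex
from collections import namedtuple

Annotation = namedtuple(
    "Annotation",
    "track_id xmin ymin xmax ymax frame lost occluded generated label attributes",
)


def parse_line(line):
    """Parse one line of the dump into an Annotation."""
    # shlex.split honors the quotation marks around labels and attributes.
    fields = shlex.split(line)
    if len(fields) < 10:
        raise ValueError("expected at least 10 columns, got %d" % len(fields))
    numbers = [int(value) for value in fields[:9]]
    track_id, xmin, ymin, xmax, ymax, frame = numbers[:6]
    lost, occluded, generated = (value == 1 for value in numbers[6:9])
    return Annotation(track_id, xmin, ymin, xmax, ymax, frame,
                      lost, occluded, generated, fields[9], tuple(fields[10:]))


def load_annotations(path):
    """Read every non-empty line of a dump file such as output.txt."""
    with open(path) as handle:
        return [parse_line(line) for line in handle if line.strip()]
```

Grouping the resulting list by track_id then recovers each object's path through the video.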