Manage Project Data with Datasets#

In VisionFlow, data is divided into two types: Parameter and Property. This section introduces multiple methods and examples of getting or updating property data.

All property data is stored in a database. The storage details are hidden behind SampleSet, which provides high-level interfaces for better usability.

Access Sample by SampleSet#

SampleSet is a map that takes samples as values and the samples' unique ids as keys. The types of the properties in the SampleSet are determined by the Project (StreamGraph). A Sample is a map that takes property data as values and the unique property node ids as keys.
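The nesting described above can be pictured with standard containers. The sketch below is only a mental model, not the VisionFlow implementation: `ToyPropertyId`, `ToyPropertyMap`, `ToySampleMap`, and `make_toy_sample_map` are hypothetical names, and property data is reduced to a plain string.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-ins for illustration only; not VisionFlow types.
using ToyPropertyId = std::string;                             // e.g. "Input/image"
using ToyPropertyMap = std::map<ToyPropertyId, std::string>;   // one sample
using ToySampleMap = std::map<std::uint32_t, ToyPropertyMap>;  // all samples

// Build a tiny set in which every sample has the same property layout.
inline ToySampleMap make_toy_sample_map() {
    ToySampleMap samples;
    samples[1]["Input/image"] = "img_1.jpg";
    samples[2]["Input/image"] = "img_2.jpg";
    // Outer key: sample id. Inner key: property node id.
    assert(samples.at(2).at("Input/image") == "img_2.jpg");
    return samples;
}
```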

Note

SampleSet is a database handler that closes the database when it is destroyed.

As mentioned above, the SampleSet is determined by the Project, and more specifically by the Project interfaces listed below:

visionflow::Project::add_tool()
visionflow::Project::remove_tool()
visionflow::Project::copy_tool()
visionflow::Project::copy_tool_group()

The properties of the SampleSet may change each time you call one of these methods.

For example, adding a tool to the Project also adds the properties of that tool to the SampleSet as placeholders.

LazySample and Sample#

Both LazySample and Sample can access property data. The difference is that a LazySample is connected to the database and reads from it on demand, while a Sample is not.

Modifications to a LazySample or a Sample do not affect the database unless you commit them with visionflow::data::SampleSet::update().

LazySample#

LazySample reads data from the database the first time that data is requested, and it records every read and modification. The second time you ask for the same data, LazySample finds it in its cache, so the access is fast.

In addition, when you update a SampleSet with a LazySample, the LazySample reports which data was modified, and only the changed data is written to the database. This also improves performance.

For more information see visionflow::LazySample.
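The caching and change-tracking behaviour described above can be sketched with standard containers. This is a toy model under stated assumptions, not the VisionFlow implementation: `ToyDatabase`, `ToyLazySample`, and `commit()` are hypothetical names, and the "database" is an in-memory map that counts reads and writes.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical stand-in for the database: property id -> data.
struct ToyDatabase {
    std::map<std::string, std::string> rows;
    int reads = 0;   // reads that actually reached the "database"
    int writes = 0;  // rows written back
};

// Hypothetical lazy view over one sample; not the VisionFlow LazySample.
class ToyLazySample {
public:
    explicit ToyLazySample(ToyDatabase &db) : db_(db) {}

    // The first access reads from the database; later accesses hit the cache.
    const std::string &get(const std::string &id) {
        auto it = cache_.find(id);
        if (it == cache_.end()) {
            ++db_.reads;
            it = cache_.emplace(id, db_.rows.at(id)).first;
        }
        return it->second;
    }

    // Modifications stay in memory and are recorded as dirty.
    void set(const std::string &id, const std::string &value) {
        cache_[id] = value;
        dirty_.insert(id);
    }

    // Write back only the properties that were modified.
    void commit() {
        for (const auto &id : dirty_) {
            db_.rows[id] = cache_.at(id);
            ++db_.writes;
        }
        dirty_.clear();
    }

private:
    ToyDatabase &db_;
    std::map<std::string, std::string> cache_;
    std::set<std::string> dirty_;
};

// Returns the number of rows written by the commit.
inline int lazy_sample_example() {
    ToyDatabase db;
    db.rows["Input/image"] = "old.jpg";
    db.rows["Input/mask"] = "mask.png";

    ToyLazySample sample(db);
    sample.get("Input/image");
    sample.get("Input/image");                    // served from the cache
    assert(db.reads == 1);

    sample.set("Input/image", "new.jpg");
    assert(db.rows["Input/image"] == "old.jpg");  // not committed yet

    sample.commit();                              // writes one row, not two
    assert(db.rows["Input/image"] == "new.jpg");
    return db.writes;
}
```

In this model the untouched "Input/mask" row is never read or written, which is the saving the text attributes to LazySample.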

Sample#

Sample is just a container that stores offline property data. It holds only the data you set, so the data you read from a Sample may be out of date. Prefer LazySample because it is more efficient. You can get a Sample object with visionflow::LazySample::fetch(), which stores a copy of the data held by the LazySample at that moment.

When you update a SampleSet with a Sample object, all properties in the Sample are written to the database, even if some of them have not changed.

For more information see visionflow::Sample.

SampleSetIterator#

SampleSetIterator is the iterator class of SampleSet. It lets you visit all samples in a SampleSet one by one, and you can use it much like std::map::iterator.

Usage:

// get SampleSet from Project
visionflow::SampleSet sample_set = project->main_sample_set();

// print information of samples iteratively
for(visionflow::SampleSetIterator iter = sample_set.begin(); iter != sample_set.end(); ++iter){
    std::cout << "sample id = " << iter.key() << std::endl;
    std::cout << "sample created time = " << iter.descriptor().created_time << std::endl;
}

// find a sample with specific id
uint32_t id = 1;
auto iter = sample_set.find(id);

For more information, see interface documentation: visionflow::data::SampleSetIterator
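The usage snippet above calls `find` without inspecting the result. With standard containers, the result of a lookup is compared against `end()` before it is used. The sketch below shows that idiom on a `std::map` stand-in. Whether `SampleSet::find` signals a missing id in the same way is an assumption that this page does not confirm, and `ToyCreatedTimes` and `created_time_or` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in: sample id -> created time.
using ToyCreatedTimes = std::map<std::uint32_t, std::string>;

// Return the created time of a sample, or a fallback if the id is unknown.
inline std::string created_time_or(const ToyCreatedTimes &samples,
                                   std::uint32_t id,
                                   const std::string &fallback) {
    const auto iter = samples.find(id);
    if (iter == samples.end()) {
        return fallback;  // id not present; the iterator must not be dereferenced
    }
    return iter->second;
}

inline void find_example() {
    ToyCreatedTimes samples;
    samples[1] = "2024-01-01";
    samples[2] = "2024-01-02";
    assert(created_time_or(samples, 1, "unknown") == "2024-01-01");
    assert(created_time_or(samples, 42, "unknown") == "unknown");
}
```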

SampleDescriptor#

A data structure that contains the created time and tags of the sample.

Get the SampleDescriptor of a specified id by: visionflow::data::ReadOnlySampleSet::sample_descriptor()

Update SampleDescriptor#

To update the SampleDescriptor information of a sample, use:

void visionflow::data::SampleSet::update(uint32_t id, const SampleDescriptor &data) or

void visionflow::data::SampleSet::update(const SampleSetIterator &iter, const ISample &data)

The following example shows how to add a new tag to the descriptor of the first sample:

// get the SampleSet from Project
visionflow::SampleSet sample_set = project->main_sample_set();

// get the SampleSetIterator of the sample to be updated
auto iter = sample_set.begin();

// get the old SampleDescriptor
visionflow::SampleDescriptor desp = iter.descriptor();

// add a new_tag to descriptor
desp.tags.add("new_tag");

// update to SampleSet by SampleSetIterator
sample_set.update(iter, desp);

// you may also update by id of the sample
// sample_set.update(iter.key(), desp);

For more information about SampleDescriptor, see visionflow::SampleDescriptor.

Add Sample#

Add an empty sample to SampleSet and get its unique id by:

visionflow::data::SampleSet::add_empty_sample()

An empty template sample is then added to the SampleSet.

You can also add an existing sample into SampleSet and get its unique id by:

visionflow::data::SampleSet::add()

To add the sample successfully, make sure that the sample and the SampleSet have the same property ids and types; otherwise visionflow::excepts::PropertyTypeMismatch is thrown.
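The compatibility rule can be modelled as a comparison of two layouts, each mapping a property id to a type name. This is a sketch under stated assumptions, not the library's check: `ToyLayout` and `check_layout` are hypothetical, and `std::invalid_argument` stands in for the VisionFlow exception.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical layout: property id -> property type name.
using ToyLayout = std::map<std::string, std::string>;

// Throws if the sample layout differs from the set layout in ids or types.
inline void check_layout(const ToyLayout &set_layout,
                         const ToyLayout &sample_layout) {
    if (set_layout != sample_layout) {
        throw std::invalid_argument("property ids or types do not match");
    }
}

inline void layout_check_example() {
    ToyLayout set_layout;
    set_layout["Input/image"] = "Image";

    ToyLayout good = set_layout;   // same ids, same types
    ToyLayout bad;
    bad["Input/image"] = "Mask";   // same id, different type

    check_layout(set_layout, good);  // accepted: no exception

    bool rejected = false;
    try {
        check_layout(set_layout, bad);
    } catch (const std::invalid_argument &) {
        rejected = true;
    }
    assert(rejected);
}
```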

Assuming that we have an Input tool (see Tool List and Detailed Flowcharts for more details about tools), the following code shows how to add a sample that contains the properties of the Input tool to the SampleSet:

// add Input tool into Project
project->add_tool("Input");

// get the SampleSet from Project
visionflow::SampleSet sample_set = project->main_sample_set();

// create an empty sample
auto sample = sample_set.create_empty_sample();

// load an image from file
visionflow::props::Image img;
img.set_image(visionflow::img::Image::FromFile("D:/path/to/img_1.jpg"));

// set the Image property data
sample.set({"Input/image"},
           std::make_shared<visionflow::props::Image>(img));

// add four samples into SampleSet
sample_set.add(sample); // sample id == 1
sample_set.add(sample); // sample id == 2
sample_set.add(sample); // sample id == 3
sample_set.add(sample); // sample id == 4

Update Sample#

You can update an existing sample by sample id or by iterator:

void visionflow::data::SampleSet::update(uint32_t id, const ISample &data)

Warning

The sample id must exist in the SampleSet and the property types must match those of the SampleSet; otherwise visionflow::excepts::SampleNotFound or visionflow::excepts::PropertyTypeMismatch is thrown.

void visionflow::data::SampleSet::update(const SampleSetIterator &iter, const ISample &data)

Warning

The sample iterator must point to an existing sample in the SampleSet and the property types must match those of the SampleSet; otherwise visionflow::excepts::SampleNotFound or visionflow::excepts::PropertyTypeMismatch is thrown.

The following code shows how to use the update interface to set a new image property. It continues from the previous code block:

// already add Input tool into Project

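// add a new empty sample into SampleSet and get its id
uint32_t new_id = sample_set.add_empty_sample();

// get the LazySample of the new sample
visionflow::LazySample sample = sample_set.at(new_id);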

// load new image from file
visionflow::props::Image img;
img.set_image(visionflow::img::Image::FromFile("D:/path/to/new_img.jpg"));

// set the Image property data
sample.set({"Input/image"},
           std::make_shared<visionflow::props::Image>(img));

// update sample_set with LazySample
sample_set.update(new_id, sample);

// you may also update by iterator
// sample_set.update(sample_set.find(new_id), sample);

Erase Sample#

Erase an existing sample by sample id:

void visionflow::data::SampleSet::erase(uint32_t id);

Warning

The sample id must exist in the SampleSet; otherwise visionflow::excepts::SampleNotFound is thrown.

Erase an existing sample by sample iterator:

SampleSetIterator visionflow::data::SampleSet::erase(const SampleSetIterator &iter)

Warning

The sample iterator must point to an existing sample in the SampleSet; otherwise visionflow::excepts::InvalidIterator is thrown.
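The iterator overload returns a SampleSetIterator. Assuming it follows the std::map::erase convention of returning the iterator that follows the erased element (this page does not say so), erasing while iterating would look like the sketch below, written against a `std::map` stand-in. `ToyTaggedSamples` and `erase_by_tag` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in: sample id -> tag.
using ToyTaggedSamples = std::map<std::uint32_t, std::string>;

// Erase every sample carrying the given tag; returns how many were erased.
inline int erase_by_tag(ToyTaggedSamples &samples, const std::string &tag) {
    int erased = 0;
    for (auto iter = samples.begin(); iter != samples.end();) {
        if (iter->second == tag) {
            iter = samples.erase(iter);  // continue from the next element
            ++erased;
        } else {
            ++iter;
        }
    }
    return erased;
}

inline void erase_example() {
    ToyTaggedSamples samples;
    samples[1] = "ok";
    samples[2] = "bad";
    samples[3] = "bad";
    samples[4] = "ok";
    assert(erase_by_tag(samples, "bad") == 2);
    assert(samples.size() == 2);
    assert(samples.count(1) == 1 && samples.count(4) == 1);
}
```

Incrementing an iterator after its element has been erased is undefined behaviour for std::map, which is why the loop takes the iterator returned by `erase` instead.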

Access Property by PropertySet#

PropertySet is a data structure that manages the data of one single property node, which has the same property type for all samples in the SampleSet.

PropertySet also records the last update time of this property node for all samples.

Note

Like SampleSet, PropertySet also holds a database handler, but it does not close the database when it is destroyed. You therefore need to make sure that all PropertySet objects are destroyed before the SampleSet object.
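One way to satisfy this ordering is to rely on the C++ rule that local objects are destroyed in the reverse order of their declaration: declare the owner first and the dependent object after it in the same scope. The sketch below demonstrates the rule with two hypothetical logging types, `ToyOwner` (in the role of SampleSet) and `ToyView` (in the role of PropertySet). It does not use the VisionFlow API.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Plays the role of SampleSet: closes the database when destroyed.
struct ToyOwner {
    std::vector<std::string> &log;
    explicit ToyOwner(std::vector<std::string> &l) : log(l) {}
    ~ToyOwner() { log.push_back("owner closed"); }
};

// Plays the role of PropertySet: borrows the handle, never closes it.
struct ToyView {
    std::vector<std::string> &log;
    ToyView(ToyOwner &, std::vector<std::string> &l) : log(l) {}
    ~ToyView() { log.push_back("view released"); }
};

// Returns the order in which the two objects were destroyed.
inline std::vector<std::string> destruction_order() {
    std::vector<std::string> log;
    {
        ToyOwner owner(log);       // declared first, destroyed last
        ToyView view(owner, log);  // declared second, destroyed first
    }
    assert(log.size() == 2);
    return log;
}
```

Storing a PropertySet in a longer-lived place than its SampleSet (for example a class member that outlives a local SampleSet) would break this ordering.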

PropertySetIterator#

Like SampleSet, PropertySet also provides an iterator class: visionflow::data::PropertySetIterator.

You can iterate over the property of every sample in the SampleSet as follows:

// get the PropertySet of Image
visionflow::data::PropertySet property_set = sample_set.property_set({"Input", "image"});

for (auto iter = property_set.begin(); iter.valid(); ++iter) {
    std::cout << "sample id :" << iter.key() << std::endl;

    // convert to props::Image ptr
    auto img = iter.value()->as<visionflow::props::Image>();
    // use img property to do some work ...
}

Access, update, erase Property#

Warning

The sample id must exist in the SampleSet and the property types must match those of the SampleSet; otherwise visionflow::excepts::SampleNotFound is thrown.

The PropertySetIterator object must point to an existing sample in the SampleSet and the property types must match those of the SampleSet; otherwise visionflow::excepts::InvalidIterator is thrown.

You can also access, update, and erase a property by sample id or by iterator:

// get the PropertySet of Image
visionflow::data::PropertySet property_set = sample_set.property_set({"Input", "image"});

// get last update time
std::cout << "last time:" << property_set.last_update_time();

// get property type
assert(property_set.property_type() == "visionflow::props::Image");

// check sample not empty
assert(property_set.sample_empty() == false);
assert(property_set.data_exists(1) == true);

// get image property by sample id
auto img_of_sample_1 = property_set.at(1);

// reset a new image to the image property
img_of_sample_1->set_image(visionflow::img::Image::FromFile("D:/path/to/new_img.jpg"));

// update PropertySet by sample id
property_set.update(1, *img_of_sample_1);

// erase PropertySet by iterator
const auto &iter = property_set.find(4);
property_set.erase(iter);

// now the image property in sample 4 will be erased

WriteBatch for better performance#

Updating the database frequently is expensive because each modification leads to disk I/O. For better performance, WriteBatch caches all your modifications in memory and then writes them to the database, and therefore to disk, in one operation.
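The idea can be sketched as a map of pending operations keyed by sample id, so that a later operation on the same sample replaces an earlier one, and everything is applied in a single pass. This is a toy model, not the VisionFlow implementation: `ToyPropertyTable`, `ToyWriteBatch`, and `apply_to` are hypothetical, and unlike the real interfaces the toy does not check that a sample exists before updating it.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for one property column: sample id -> data.
using ToyPropertyTable = std::map<std::uint32_t, std::string>;

// Hypothetical batch; not the VisionFlow write batch classes.
class ToyWriteBatch {
public:
    // Queue an update; replaces any earlier operation on the same id.
    void update(std::uint32_t id, const std::string &value) {
        ops_[id] = Op{false, value};
    }

    // Queue an erase; replaces any earlier operation on the same id.
    void erase(std::uint32_t id) { ops_[id] = Op{true, std::string()}; }

    // Apply all queued operations in one pass.
    void apply_to(ToyPropertyTable &table) const {
        for (const auto &entry : ops_) {
            if (entry.second.is_erase) {
                table.erase(entry.first);
            } else {
                table[entry.first] = entry.second.value;
            }
        }
    }

private:
    struct Op {
        bool is_erase;
        std::string value;
    };
    std::map<std::uint32_t, Op> ops_;  // one pending operation per sample id
};

inline void write_batch_example() {
    ToyPropertyTable table;
    table[1] = "img_1";
    table[2] = "img_2";
    table[3] = "img_3";

    ToyWriteBatch batch;
    batch.update(1, "img_101");
    batch.update(2, "img_102");
    batch.erase(3);
    batch.erase(2);  // overrides the queued update of sample 2

    batch.apply_to(table);
    assert(table.size() == 1);
    assert(table.at(1) == "img_101");
}
```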

PropertyWriteBatch#

PropertyWriteBatch caches all modification operations for a PropertySet.

Note

For the same sample, only the last modification takes effect.

Warning

You need to make sure that the sample to be modified exists in the SampleSet; otherwise visionflow::excepts::SampleNotFound is thrown when updating.

The following example shows how to use PropertyWriteBatch:

// get the PropertySet of Image
visionflow::data::PropertySet property_set = sample_set.property_set({"Input", "image"});

// firstly, create a PropertyWriteBatch
PropertyWriteBatch prop_write_batch = property_set.create_write_batch();

// cache all your modifications to the image PropertySet for different samples

// update sample 1 with img_101
visionflow::props::Image img_101;
img_101.set_image(visionflow::img::Image::FromFile("D:/path/to/img_101.jpg"));
prop_write_batch.update(1, img_101);

// update sample 2 with img_102
visionflow::props::Image img_102;
img_102.set_image(visionflow::img::Image::FromFile("D:/path/to/img_102.jpg"));
prop_write_batch.update(2, img_102);

// erase image property in sample 3
prop_write_batch.erase(3);

// erase image property in sample 2
// this overrides the earlier update queued for sample 2
prop_write_batch.erase(2);

// update to database at once
property_set.write_batch(prop_write_batch);

// now all the modifications have been written to the database:
//   the image property of sample 1 is updated to img_101;
//   the image properties of sample 2 and sample 3 are erased.

SampleWriteBatch#

SampleWriteBatch caches all modification operations for a SampleSet, such as modifying a sample, modifying a sample descriptor, or updating a single property in a sample.

Note

For the same sample, only the last modification takes effect.

Warning

You need to make sure that the sample to be modified exists in the SampleSet; otherwise visionflow::excepts::SampleNotFound is thrown when updating.

The following example shows how to use SampleWriteBatch:

// firstly, create a SampleWriteBatch
SampleWriteBatch sample_write_batch = sample_set.create_write_batch();

const visionflow::ToolNodeId img_prop_id = {"Input/image"};

// cache all your modifications in the SampleWriteBatch

// update sample 1 with img_101
visionflow::props::Image img_101;
img_101.set_image(visionflow::img::Image::FromFile("D:/path/to/img_101.jpg"));
sample_write_batch.update(
    1, img_prop_id, std::make_shared<visionflow::props::Image>(img_101));

// update sample 2 with img_102
visionflow::props::Image img_102;
img_102.set_image(visionflow::img::Image::FromFile("D:/path/to/img_102.jpg"));
sample_write_batch.update(
    2, img_prop_id, std::make_shared<visionflow::props::Image>(img_102));

// erase the whole sample 3
sample_write_batch.erase(3);

// erase image property in sample 2
// this overrides the earlier update queued for sample 2
sample_write_batch.erase(2, img_prop_id);

// update to database at once
sample_set.write_batch(sample_write_batch);

// now all the modifications have been written to the database:
//   the image property of sample 1 is updated to img_101;
//   the image property of sample 2 is erased;
//   the whole sample 3 is erased from the SampleSet.

Data filter#

Both SampleSet and PropertySet provide a filter function that selects the data you want. The filter criterion is a Python script passed as a string.

Note

To write the Python script, you need to know:
  1. Indentation is significant in Python, so pay close attention to it.

  2. The script must define a function that implements the filtering logic and returns a Boolean as the filter result.

  3. The function must be named "vflow_filter" and take a Sample (for ReadOnlySampleSetView) or a Property (for ReadOnlyPropertySetView) as its parameter.

  4. Import the appropriate visionflow modules in Python. A Python module is analogous to a namespace in C++.

Warning

The script must be syntactically valid Python; otherwise visionflow::excepts::PythonSyntaxError is thrown.

ReadOnlySampleSetView#

The following code shows how to select the samples whose image width and height are both between 1024 and 2048:

// firstly, write our filter script
const std::string filter_script = R"(
from visionflow import *
from visionflow.props import *
from visionflow.img import *
def vflow_filter(sample):
    img_prop_id = ToolNodeId("Input", "image")

    if not sample.exist_property_data(img_prop_id):
        return False
    img = sample.get(img_prop_id).image()
    return 1024 <= img.size().w <= 2048 and 1024 <= img.size().h <= 2048
)";

auto sample_view = sample_set.filter(filter_script);

// if the image in sample 1 meets the filter requirement, its id will exist in the view
// assert(sample_view.exists(1));

// the same applies to the other samples
// assert(sample_view.exists(2));
assert(sample_view.ids() == std::vector<uint32_t>{1, 2});

For more details see visionflow::data::ReadOnlySampleSetView.

ReadOnlyPropertySetView#

ReadOnlyPropertySetView supports two kinds of filtering: excluding samples directly by id, and filtering properties with a Python script.

Set filter ids to exclude the samples you do not want. Set a Python script to select the properties that meet your requirements.

The Python script for ReadOnlyPropertySetView follows the same rules as the script for ReadOnlySampleSetView, except that its function takes a Property as the parameter, while the function for ReadOnlySampleSetView takes a Sample (Script Helper).

The following code shows how to select the samples whose image width and height are both between 1024 and 2048 while excluding sample 1:

// firstly, define the id of the image property
const visionflow::ToolNodeId img_prop_id = {"Input/image"};

// secondly, write our filter script
const std::string filter_script = R"(
from visionflow import *
from visionflow.props import *
from visionflow.img import *
def vflow_filter(prop):
    img = prop.image()
    return 1024 <= img.size().w <= 2048 and 1024 <= img.size().h <= 2048
)";

PropertySet img_prop_set = sample_set.property_set(img_prop_id);

ReadOnlyPropertySetView img_prop_view = img_prop_set.filter(filter_script);

// if the image in sample 1 meets the requirement of the Python script, its data will exist
assert(img_prop_view.sample_exists(1));
assert(img_prop_view.data_exists(1));

// if the image in sample 2 does not meet the requirement of the Python script, its data is filtered out
assert(img_prop_view.sample_exists(2));
assert(img_prop_view.data_exists(2) == false);

// all samples listed in the filter ids are excluded,
// whether or not they meet the requirement of the Python script
img_prop_view.set_filter_ids({1});
assert(img_prop_view.sample_exists(1) == false);
assert(img_prop_view.data_exists(1) == false);

For more details see visionflow::data::ReadOnlyPropertySetView.
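The two filtering levels in this section can be modelled in plain C++ as an exclusion set plus a predicate. The sketch below reproduces the behaviour asserted in the example above (a sample that fails the predicate still exists but has no visible data, while an excluded sample disappears entirely). It is a toy under stated assumptions: `ToyImageSize`, `ToyPropertyView`, and `size_in_range` are hypothetical, and a C++ function replaces the Python script.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <set>
#include <utility>

struct ToyImageSize {
    int w;
    int h;
};

// Hypothetical read-only view; not the VisionFlow ReadOnlyPropertySetView.
class ToyPropertyView {
public:
    using Predicate = std::function<bool(const ToyImageSize &)>;

    ToyPropertyView(std::map<std::uint32_t, ToyImageSize> data, Predicate pred)
        : data_(std::move(data)), pred_(std::move(pred)) {}

    void set_filter_ids(const std::set<std::uint32_t> &ids) { excluded_ = ids; }

    // A sample is visible if it is stored and not excluded by id.
    bool sample_exists(std::uint32_t id) const {
        return data_.count(id) != 0 && excluded_.count(id) == 0;
    }

    // Data is visible if the sample is visible and the predicate accepts it.
    bool data_exists(std::uint32_t id) const {
        return sample_exists(id) && pred_(data_.at(id));
    }

private:
    std::map<std::uint32_t, ToyImageSize> data_;
    Predicate pred_;
    std::set<std::uint32_t> excluded_;
};

// Width and height must both lie in [1024, 2048].
inline bool size_in_range(const ToyImageSize &s) {
    return 1024 <= s.w && s.w <= 2048 && 1024 <= s.h && s.h <= 2048;
}

inline void property_view_example() {
    std::map<std::uint32_t, ToyImageSize> sizes;
    sizes[1] = ToyImageSize{1280, 1024};
    sizes[2] = ToyImageSize{640, 480};

    ToyPropertyView view(sizes, size_in_range);
    assert(view.sample_exists(1) && view.data_exists(1));
    assert(view.sample_exists(2) && !view.data_exists(2));

    view.set_filter_ids({1});
    assert(!view.sample_exists(1) && !view.data_exists(1));
}
```

Note that the Python scripts can write `1024 <= w <= 2048` because Python chains comparisons. The same expression in C++ compiles but compares a bool with 2048, so the range check has to be spelled out with `&&` as in `size_in_range`.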