Signature | Description | Parameters |
---|---|---|
template<typename T, typename F, typename ... Ts> DataFrame get_data_by_sel(const char *name, F &sel_functor) const; | This method performs Boolean filtering selection via sel_functor (e.g. a functor, function, or lambda) and returns a new DataFrame. Each element of the named column, along with its corresponding index value, is passed to sel_functor. If sel_functor returns true, that index is selected and the elements of all columns at that index are included in the returned DataFrame. The signature of sel_functor is: bool ()(const IndexType &, const T &). NOTE: If the selection logic results in empty column(s), the resulting empty columns will _not_ be padded with NaNs. You can always call make_consistent() on the original or the resulting DataFrame to make all columns the same length. | T: Type of the named column; F: Type of the selecting functor; Ts: The list of types for all columns. A type should be specified only once; name: Name of the data column; sel_functor: A reference to the selecting functor |
template<typename T, typename F, typename ... Ts> DataFramePtrView<I> get_view_by_sel(const char *name, F &sel_functor) const; | This is identical to get_data_by_sel() above, but it returns a view instead of a new DataFrame. NOTE: Although this is a const method, it returns a view, so the data can still be modified through the returned view. | T: Type of the named column; F: Type of the selecting functor; Ts: The list of types for all columns. A type should be specified only once; name: Name of the data column; sel_functor: A reference to the selecting functor |
template<typename T1, typename T2, typename F, typename ... Ts> DataFrame get_data_by_sel(const char *name1, const char *name2, F &sel_functor) const; | This does the same as get_data_by_sel() above, but operates on two columns. The signature of sel_functor is: bool ()(const IndexType &, const T1 &, const T2 &) | T1: Type of the first named column; T2: Type of the second named column; F: Type of the selecting functor; Ts: The list of types for all columns. A type should be specified only once; name1: Name of the first data column; name2: Name of the second data column; sel_functor: A reference to the selecting functor |
template<typename T1, typename T2, typename F, typename ... Ts> DataFramePtrView<I> get_view_by_sel(const char *name1, const char *name2, F &sel_functor) const; | This is identical to the two-column get_data_by_sel() above, but it returns a view instead of a new DataFrame. NOTE: Although this is a const method, it returns a view, so the data can still be modified through the returned view. | T1: Type of the first named column; T2: Type of the second named column; F: Type of the selecting functor; Ts: The list of types for all columns. A type should be specified only once; name1: Name of the first data column; name2: Name of the second data column; sel_functor: A reference to the selecting functor |
template<typename T1, typename T2, typename T3, typename F, typename ... Ts> DataFrame get_data_by_sel(const char *name1, const char *name2, const char *name3, F &sel_functor) const; | This does the same as get_data_by_sel() above, but operates on three columns. The signature of sel_functor is: bool ()(const IndexType &, const T1 &, const T2 &, const T3 &) | T1: Type of the first named column; T2: Type of the second named column; T3: Type of the third named column; F: Type of the selecting functor; Ts: The list of types for all columns. A type should be specified only once; name1: Name of the first data column; name2: Name of the second data column; name3: Name of the third data column; sel_functor: A reference to the selecting functor |
template<typename T1, typename T2, typename T3, typename F, typename ... Ts> DataFramePtrView<I> get_view_by_sel(const char *name1, const char *name2, const char *name3, F &sel_functor) const; | This is identical to the three-column get_data_by_sel() above, but it returns a view instead of a new DataFrame. NOTE: Although this is a const method, it returns a view, so the data can still be modified through the returned view. | T1: Type of the first named column; T2: Type of the second named column; T3: Type of the third named column; F: Type of the selecting functor; Ts: The list of types for all columns. A type should be specified only once; name1: Name of the first data column; name2: Name of the second data column; name3: Name of the third data column; sel_functor: A reference to the selecting functor |
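
The selection rules described in the table can be imitated with the standard library alone. The following is a minimal sketch under stated assumptions, not the library's implementation: select_positions, copy_picked, view_picked, and selection_demo are hypothetical names introduced only for illustration, and modelling a view as a vector of pointers is an assumption suggested by the name DataFramePtrView. It shows how a functor is called with the index value plus one or more column values, why a shorter column is not padded, and why a write through a view is visible in the original data.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// NOTE: every name below is hypothetical and uses only the standard library.
// It imitates the documented behavior; it is not the DataFrame code itself.

// Pass each index value and the matching element of every named column to
// sel, and remember the positions for which sel returns true.
template<typename I, typename F, typename ... Ts>
std::vector<std::size_t>
select_positions(const std::vector<I> &index,
                 F &sel,
                 const std::vector<Ts> & ... columns)
{
    // Only positions present in the index and in every named column are tested
    const std::size_t n = std::min({ index.size(), columns.size() ... });
    std::vector<std::size_t> picked;

    for (std::size_t i = 0; i < n; ++i)
        if (sel(index[i], columns[i] ...))
            picked.push_back(i);
    return picked;
}

// Copy semantics, in the spirit of get_data_by_sel(): positions past the end
// of a short column are skipped, so nothing is padded with NaN.
template<typename T>
std::vector<T>
copy_picked(const std::vector<T> &column,
            const std::vector<std::size_t> &picked)
{
    std::vector<T> result;

    for (const std::size_t pos : picked)
        if (pos < column.size())
            result.push_back(column[pos]);
    return result;
}

// View semantics, in the spirit of get_view_by_sel(): keep pointers to the
// picked elements instead of copies (an assumption made for illustration).
template<typename T>
std::vector<T *>
view_picked(std::vector<T> &column, const std::vector<std::size_t> &picked)
{
    std::vector<T *> result;

    for (const std::size_t pos : picked)
        if (pos < column.size())
            result.push_back(&column[pos]);
    return result;
}

// Walk-through using the col_1 / col_4 data from the tests in this section
void selection_demo()
{
    std::vector<unsigned long> idx =
        { 123450, 123451, 123452, 123453, 123454, 123455, 123456 };
    std::vector<double> col_1 = { 1, 2, 3, 4, 5, 6, 7 };
    std::vector<double> col_4 = { 22, 23, 24, 25 };
    auto sel =
        [](const unsigned long &, const double &val) -> bool {
            return (val >= 5);
        };

    // Positions 4, 5 and 6 satisfy val >= 5
    const auto picked = select_positions(idx, sel, col_1);

    assert(picked.size() == 3);
    assert(copy_picked(idx, picked).front() == 123454);
    assert(copy_picked(col_4, picked).empty());  // short column, no padding

    auto view = view_picked(col_1, picked);

    *view[1] = 600;  // write through the view ...
    assert(col_1[5] == 600);  // ... and the original column changes
}
```

Because select_positions takes the columns as a trailing parameter pack, the same helper covers the one-, two-, and three-column forms; the argument order therefore differs from the library methods, where the functor comes last.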

    static void test_get_data_by_sel() {

        std::cout << "\nTesting get_data_by_sel() ..." << std::endl;

        std::vector<unsigned long> idx =
            { 123450, 123451, 123452, 123453, 123454, 123455, 123456 };
        std::vector<double> d1 = { 1, 2, 3, 4, 5, 6, 7 };
        std::vector<double> d2 = { 8, 9, 10, 11, 12, 13, 14 };
        std::vector<double> d3 = { 15, 16, 17, 18, 19, 20, 21 };
        std::vector<double> d4 = { 22, 23, 24, 25 };
        std::vector<std::string> s1 =
            { "11", "22", "33", "ee", "ff", "gg", "ll" };
        MyDataFrame df;

        df.load_data(std::move(idx),
                     std::make_pair("col_1", d1),
                     std::make_pair("col_2", d2),
                     std::make_pair("col_3", d3),
                     std::make_pair("col_str", s1));
        df.load_column("col_4", std::move(d4), nan_policy::dont_pad_with_nans);

        auto functor =
            [](const unsigned long &, const double &val)-> bool {
                return (val >= 5);
            };
        auto result =
            df.get_data_by_sel<double, decltype(functor), double, std::string>
                ("col_1", functor);

        assert(result.get_index().size() == 3);
        assert(result.get_column<double>("col_1").size() == 3);
        assert(result.get_column<std::string>("col_str").size() == 3);
        assert(result.get_column<double>("col_4").size() == 0);
        assert(result.get_index()[0] == 123454);
        assert(result.get_index()[2] == 123456);
        assert(result.get_column<double>("col_2")[1] == 13);
        assert(result.get_column<std::string>("col_str")[1] == "gg");
        assert(result.get_column<std::string>("col_str")[2] == "ll");
        assert(result.get_column<double>("col_1")[1] == 6);
        assert(result.get_column<double>("col_1")[2] == 7);

        auto functor2 =
            [](const unsigned long &,
               const double &val1,
               const double &val2,
               const std::string val3)-> bool {
                return (val1 >= 5 || val2 == 15 || val3 == "33");
            };
        auto result2 =
            df.get_data_by_sel<double,
                               double,
                               std::string,
                               decltype(functor2),
                               double, std::string>
                ("col_1", "col_3", "col_str", functor2);

        assert(result2.get_index().size() == 5);
        assert(result2.get_column<double>("col_1").size() == 5);
        assert(result2.get_column<std::string>("col_str").size() == 5);
        assert(result2.get_column<double>("col_4").size() == 2);
        assert(result2.get_index()[0] == 123450);
        assert(result2.get_index()[2] == 123454);
        assert(result2.get_index()[4] == 123456);
        assert(result2.get_column<double>("col_2")[0] == 8);
        assert(result2.get_column<double>("col_2")[1] == 10);
        assert(result2.get_column<double>("col_2")[3] == 13);
        assert(result2.get_column<double>("col_4")[0] == 22);
        assert(result2.get_column<double>("col_4")[1] == 24);
        assert(result2.get_column<std::string>("col_str")[0] == "11");
        assert(result2.get_column<std::string>("col_str")[1] == "33");
        assert(result2.get_column<std::string>("col_str")[2] == "ff");
        assert(result2.get_column<std::string>("col_str")[4] == "ll");
        assert(result2.get_column<double>("col_1")[0] == 1);
        assert(result2.get_column<double>("col_1")[1] == 3);
        assert(result2.get_column<double>("col_1")[2] == 5);
    }

    // -----------------------------------------------------------------------------

    static void test_get_view_by_sel() {

        std::cout << "\nTesting get_view_by_sel() ..." << std::endl;

        std::vector<unsigned long> idx =
            { 123450, 123451, 123452, 123453, 123454, 123455, 123456 };
        std::vector<double> d1 = { 1, 2, 3, 4, 5, 6, 7 };
        std::vector<double> d2 = { 8, 9, 10, 11, 12, 13, 14 };
        std::vector<double> d3 = { 15, 16, 17, 18, 19, 20, 21 };
        std::vector<double> d4 = { 22, 23, 24, 25 };
        std::vector<std::string> s1 =
            { "11", "22", "33", "ee", "ff", "gg", "ll" };
        MyDataFrame df;

        df.load_data(std::move(idx),
                     std::make_pair("col_1", d1),
                     std::make_pair("col_2", d2),
                     std::make_pair("col_3", d3),
                     std::make_pair("col_str", s1));
        df.load_column("col_4", std::move(d4), nan_policy::dont_pad_with_nans);

        auto functor =
            [](const unsigned long &, const double &val)-> bool {
                return (val >= 5);
            };
        auto result =
            df.get_view_by_sel<double, decltype(functor), double, std::string>
                ("col_1", functor);

        result.shrink_to_fit<double, std::string>();
        assert(result.get_index().size() == 3);
        assert(result.get_column<double>("col_1").size() == 3);
        assert(result.get_column<std::string>("col_str").size() == 3);
        assert(result.get_column<double>("col_4").size() == 0);
        assert(result.get_index()[0] == 123454);
        assert(result.get_index()[2] == 123456);
        assert(result.get_column<double>("col_2")[1] == 13);
        assert(result.get_column<std::string>("col_str")[1] == "gg");
        assert(result.get_column<std::string>("col_str")[2] == "ll");
        assert(result.get_column<double>("col_1")[1] == 6);
        assert(result.get_column<double>("col_1")[2] == 7);

        result.get_column<double>("col_1")[1] = 600;
        assert(result.get_column<double>("col_1")[1] == 600);
        assert(df.get_column<double>("col_1")[5] == 600);

        auto functor2 =
            [](const unsigned long &,
               const double &val1,
               const double &val2)-> bool {
                return (val1 >= 5 || val2 == 15);
            };
        auto result2 =
            df.get_view_by_sel<double,
                               double,
                               decltype(functor2),
                               double, std::string>
                ("col_1", "col_3", functor2);

        auto functor3 =
            [](const unsigned long &,
               const double &val1,
               const double &val2,
               const std::string val3)-> bool {
                return (val1 >= 5 || val2 == 15 || val3 == "33");
            };
        auto result3 =
            df.get_view_by_sel<double,
                               double,
                               std::string,
                               decltype(functor3),
                               double, std::string>
                ("col_1", "col_3", "col_str", functor3);

        assert(result3.get_index().size() == 5);
        assert(result3.get_column<double>("col_1").size() == 5);
        assert(result3.get_column<std::string>("col_str").size() == 5);
        assert(result3.get_column<double>("col_4").size() == 2);
        assert(result3.get_index()[0] == 123450);
        assert(result3.get_index()[2] == 123454);
        assert(result3.get_index()[4] == 123456);
        assert(result3.get_column<double>("col_2")[0] == 8);
        assert(result3.get_column<double>("col_2")[1] == 10);
        assert(result3.get_column<double>("col_2")[3] == 13);
        assert(result3.get_column<double>("col_4")[0] == 22);
        assert(result3.get_column<double>("col_4")[1] == 24);
        assert(result3.get_column<std::string>("col_str")[0] == "11");
        assert(result3.get_column<std::string>("col_str")[1] == "33");
        assert(result3.get_column<std::string>("col_str")[2] == "ff");
        assert(result3.get_column<std::string>("col_str")[4] == "ll");
        assert(result3.get_column<double>("col_1")[0] == 1);
        assert(result3.get_column<double>("col_1")[1] == 3);
        assert(result3.get_column<double>("col_1")[2] == 5);
    }