error handling within std::transform iteration

Time:02-01

This question is about customising error handling within the unary operation (UnaryOperation) passed to std::transform.

Parameters
first1, last1   -   the first range of elements to transform
first2  -   the beginning of the second range of elements to transform
d_first -   the beginning of the destination range, may be equal to first1 or first2
policy  -   the execution policy to use. See execution policy for details.
unary_op    -   unary operation function object that will be applied.

The standard API allows customising the transformation logic that runs during a single iteration. However, it does not document how one could customise the way the result is output, apart from the requirement that d_first must be an output iterator. As a result, std::transform performs a 1-to-1 transformation by default: the output range has the same size as the input range.

However, I want to customize the behavior to ignore the output when an error has occurred. That would result in an output range of size n_original - n_errors.
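To make the 1-to-1 behaviour concrete, here is a minimal sketch. The helper name `square_all` is hypothetical and not part of the question; it only illustrates that, with a plain `std::back_inserter`, the operation has no way to skip an element.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical helper for illustration: squares every element of `in`.
std::vector<int> square_all(const std::vector<int>& in)
{
    std::vector<int> out;
    out.reserve(in.size());
    std::transform(in.cbegin(), in.cend(), std::back_inserter(out),
                   [](int i) { return i * i; });
    // std::transform assigns exactly one result per input element,
    // so with std::back_inserter the sizes always match.
    assert(out.size() == in.size());
    return out;
}
```

Whatever the lambda returns is assigned through the output iterator, so dropping an element has to happen either in the iterator or in a different algorithm.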

Here is a code example that parses a Visual Studio solution file string and gets a list of projects using a regex. Obviously, the file can be corrupted to some extent, but failing at the step of extracting the projects' info is not acceptable; logging an error would suffice.

class VSParser
{
public:
    static auto projects(std::string_view slnFile)
    {
        std::regex pattern{
            R"(Project\(\"\{(?:[a-zA-Z0-9]|\-){36}\}\"\)\s*=\s*\"(.+?)\",\s*\"(.+?)\",)"
        };

        struct ProjInfo
        {
            std::string name;
            std::filesystem::path path;
        };

        using regex_iter_type = std::regex_iterator<std::decay_t<decltype(slnFile)>::iterator>;

        std::vector<ProjInfo> projects;
        std::transform(regex_iter_type(slnFile.cbegin(), slnFile.cend(), pattern),
                       regex_iter_type(),
                       std::back_inserter(projects),
                       [](const auto &match) -> ProjInfo
                       {
                           // TODO: handle parsing errors
                           return {std::string(match[1]),
                           std::string(match[2])};
                       });

        return projects;
    }

private:
};

The problem here is that the return type of the UnaryOperation must be assignable to the result of dereferencing the output iterator. So I can't see how I could get the call to compile with std::optional as the operation's return type:

        std::vector<ProjInfo> projects;
        std::transform(regex_iter_type(slnFile.cbegin(), slnFile.cend(), pattern),
                       regex_iter_type(),
                       [&projects](const auto&)
                       {
                         // insert if not nullopt
                       }, // example. Will not compile since it is a callable, not an output iterator
                       [](const auto &match) -> std::optional<ProjInfo> 
                       {
                           // TODO: handle parsing errors
                           return ProjInfo{std::string(match[1]),
                           std::string(match[2])};
                       });

        return projects;

I know that I can build a vector of optionals and then strip the invalid elements from it, but since std::optional<ProjInfo> and ProjInfo are different types, that would double the allocation and copy overhead, which I want to avoid if possible.
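For reference, the two-pass approach described above could look roughly like the sketch below. It uses `int` instead of `ProjInfo`, and the names `keep_odd` and `two_pass` are hypothetical stand-ins for the real parsing step; the point is only to show where the second container comes from.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-in for the parsing step: odd numbers "parse", even ones fail.
std::optional<int> keep_odd(int i)
{
    if (i % 2 != 0)
        return i;
    return std::nullopt;
}

// Two-pass approach: stage optionals first, then move out the engaged ones.
std::vector<int> two_pass(const std::vector<int>& in)
{
    std::vector<std::optional<int>> staged(in.size()); // first allocation
    std::transform(in.cbegin(), in.cend(), staged.begin(), keep_odd);

    std::vector<int> result; // second allocation, different element type
    for (auto& v : staged)
        if (v)
            result.push_back(std::move(*v));

    // Elements can only be dropped, never added.
    assert(result.size() <= staged.size());
    return result;
}
```

The intermediate `staged` vector exists only to carry the success flag from the first pass to the second, which is the overhead the question wants to avoid.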

CodePudding user response:

Rather than using std::transform, use a similar algorithm:

template< class InputIt,
          class OutputIt,
          class UnaryOperation >
OutputIt transform_if( InputIt first1,
                       InputIt last1,
                       OutputIt d_first, 
                       UnaryOperation unary_op )
{
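    while (first1 != last1) {
        // Advance the input on every iteration; write (and advance the
        // output) only when unary_op returned an engaged optional.
        if(auto v = unary_op(*first1++)) {
            *d_first++ = *std::move(v);
        }
    }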
    return d_first;
}
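As a usage sketch, the algorithm can be combined with an operation that returns std::optional. Everything below other than `transform_if` itself is an assumption made for illustration: `parse_int` and `parse_all` are hypothetical names, and std::from_chars stands in for the regex-based parsing in the question. The algorithm is repeated so the sketch compiles on its own.

```cpp
#include <cassert>
#include <charconv>
#include <iterator>
#include <optional>
#include <string_view>
#include <system_error>
#include <utility>
#include <vector>

// Same shape as the algorithm above, repeated so this sketch is self-contained.
template <class InputIt, class OutputIt, class UnaryOperation>
OutputIt transform_if(InputIt first1, InputIt last1, OutputIt d_first,
                      UnaryOperation unary_op)
{
    while (first1 != last1) {
        if (auto v = unary_op(*first1++)) {
            *d_first++ = *std::move(v);
        }
    }
    return d_first;
}

// Hypothetical parser: returns std::nullopt instead of throwing on bad input.
std::optional<int> parse_int(std::string_view text)
{
    int value = 0;
    const char* begin = text.data();
    const char* end = begin + text.size();
    auto [ptr, ec] = std::from_chars(begin, end, value);
    if (ec != std::errc{} || ptr != end)
        return std::nullopt; // a real caller could log the failure here
    return value;
}

// Hypothetical driver: keeps only the tokens that parsed completely.
std::vector<int> parse_all(const std::vector<std::string_view>& tokens)
{
    std::vector<int> out;
    transform_if(tokens.cbegin(), tokens.cend(), std::back_inserter(out), parse_int);
    assert(out.size() <= tokens.size()); // failures are skipped, never padded
    return out;
}
```

Because the failed results are never written, the output vector holds `ProjInfo`-style plain values directly and no intermediate container of optionals is needed.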

CodePudding user response:

So this wrapper compiles. Example.

#include <algorithm>
#include <vector>
#include <iterator>
#include <optional>
#include <iostream>

namespace detail
{
    template <typename ContainerT>
    class optional_back_inserter
    {
    public:
        using container_type = ContainerT;
        using value_type = typename container_type::value_type;

        constexpr explicit optional_back_inserter(container_type &container)
            : backInserter_(container)
        {}

        optional_back_inserter& operator=(std::optional<value_type> value)
        {
            if (value)
                backInserter_ = std::move(*value);
            return *this;
        }

        /**
         * no-op
         * @return
         */
        constexpr optional_back_inserter& operator*()
        {
            return *this;
        }

        /**
         * no-op
         * @return
         */
        constexpr optional_back_inserter& operator++()
        {
            return *this;
        }


        /**
         * no-op
         * @return
         */
        constexpr optional_back_inserter operator++(int)
        {
            return *this;
        }

    private:
        std::back_insert_iterator<container_type> backInserter_;
    };
}

template <typename ContainerT>
constexpr auto optional_back_inserter(ContainerT &container)
{
    return detail::optional_back_inserter<ContainerT>(container);
}


int main()
{
    std::vector<int> vec{4, 8, 15, 16, 23, 42};
    std::vector<int> output{};

    std::transform(vec.cbegin(), 
    vec.cend(),
    optional_back_inserter(output),
    [](int i) -> std::optional<int>
    {
        if (i % 2)
            return {i};

        return std::nullopt;
    });

    std::cout << vec.size() << std::endl;
    std::cout << output.size() << std::endl;

    return 0;
}

This is "more verbose", but it is more about separation of responsibilities. By customising only the output iterator's behavior, I remove the need to touch the transformation logic itself. It is SOLID-friendly, since I don't need to modify the general logic; instead I can provide a customisation object in any other part of the code.
