HTTP REST request and response to alter many-to-many collections?

What are the correct HTTP requests and responses for adding and removing items from a many-to-many collection?

Note that while it is possible to update a collection by using PUT or PATCH to replace the entire collection, I want to be able to add or remove a single resource. My reasoning: this is the case 95% of the time, replacing a large collection causes performance issues, and replacing every member seems more complicated and harder to troubleshoot.

While my specific questions on this topic are listed below, if I have missed others that are important to implementing REST with many-to-many collections, please include them as you see fit. The following questions refer to a hypothetical Book/Author/Publisher/Distributor model described below.

How should one add an existing book to an existing distributor? Maybe POST /distributors/{distributor_id}/books with {"book_id": 123} in the body, or PUT /distributors/{distributor_id}/books/{book_id} with an empty body? Should the response be the updated book or distributor, the updated book or distributor's collection, or something else?
How should one remove an existing book from an existing distributor? Note that the book is not being deleted but only removed from the collection. Maybe DELETE /distributors/{distributor_id}/books/{book_id} with an empty body? Should the response be an empty 204, the updated book or distributor, the updated book or distributor's collection, or something else?
If a many-to-many relationship results in another entity (such as DistributorPublisherContract), should it be handled the same as ones without entities (i.e. book_has_distributor), or should it be handled similar to other entities which have identifiers?

The following questions might be considered slightly off-topic and need not be answered if you feel they are (though I would still appreciate a comment):

To remove an author from a publisher, should DELETE /publishers/{publisher_id}/authors/{author_id} be implemented, or should one just do PUT /authors/{id} and set author.publisher to null?
Should one-to-one relationships (i.e. Spouse) typically be objects and not URLs in the request and response?
How important is it to implement routes to both create a new resource and also create a new sub-resource which is a member of the first resource's collection?

Example Data

Business rules are:

Each book must have an author and an author can have zero or many books.
Each author can be unmarried or have one spouse, and if an author divorces her husband, the husband doesn't exist any more.
An author may have zero or one publisher; however, a publisher can work with many authors.
Many distributors distribute many books.
Many distributors have contracts with many publishers, but a distributor can have only a single contract with a given publisher, and the contract date must be saved.

My database schema looks like:

book
 - id (PK)
 - author_id (FK, NOT NULL)
author
 - id (PK)
 - publisher_id (FK, NULLABLE)
 - spouse_id (FK, NULLABLE)
spouse
 - id (PK)
publisher
 - id (PK)
distributor
 - id (PK)
book_has_distributor
 - book_id (FK, NOT NULL)
 - distributor_id (FK, NOT NULL)
distributor_publisher_contract
 - id (PK) [or use composite book_id/distributor_id PK if desired]
 - book_id (FK, NOT NULL)
 - distributor_id (FK, NOT NULL)
 - contractDate (datetime)

My entities are: Book, Author, Spouse, Publisher, Distributor, DistributorPublisherContract

Background Context

Provided only to document my understanding and hopefully help others.

Basic requests are:

Action	Method	Path	Body	Type	Response	Status Code
Get resources	GET	/books	[empty]	collection	Array of books	200
Add a resource	POST	/books	Book object	collection	Book object	201
Get a resource	GET	/books/{id}	[empty]	item	Book object	200
Replace a resource	PUT	/books/{id}	Book object	item	Book object	200
Replace properties in a resource	PATCH	/books/{id}	Partial book object	item	Book object	200
Delete a resource	DELETE	/books/{id}	[empty]	item	[empty]	204

The following are redundant and can be performed by other means:

Add a new book to an existing author - Handled by POST /books, since the author's URL is in the body and the server will add the book to the author's collection.
Remove a book from an author - Handled by DELETE /books/{id}, since the server will remove the book from the author's collection.
Remove an author from a publisher - Handled by PUT or PATCH /authors/{id} with publisher as NULL in the body, since the server will update the publisher's collection.
Add a new book to an existing distributor - Handled by POST /books, since the distributor's URL is in the body and the server will add the book to the distributor's collection.

Answer:

As REST is more a design style than a strict protocol, you should consider the following factors before implementation:

End user experience.
Mental model and industry application practice.
Handling of the corner cases.
Non-functional requirements.

End user experience

An API developer has to clearly picture the client type (web/mobile/native/embedded/...), network conditions (firewalls, CDNs), and network protocol limitations (header/body/query sizes). Some gateways or WAFs, for example, limit the allowed HTTP methods to GET and POST. Information about the desired method is then usually passed in a header.
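
As a minimal sketch of the header approach, assuming the widely used but non-standard X-HTTP-Method-Override header (the answer does not name a specific header, and effective_method is a hypothetical helper), a server could resolve the intended method like this:

```python
# Hypothetical helper: resolve the method a client intended when a gateway
# only lets GET and POST through. X-HTTP-Method-Override is a de facto
# convention, not part of the HTTP standard, and is an assumption here.
OVERRIDABLE = {"PUT", "PATCH", "DELETE"}


def effective_method(method: str, headers: dict[str, str]) -> str:
    """Return the method the server should dispatch on."""
    # Header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    override = lowered.get("x-http-method-override", "").strip().upper()
    # Only tunnel through POST: letting a GET become a DELETE would turn
    # a "safe" request into a destructive one.
    if method.upper() == "POST" and override in OVERRIDABLE:
        return override
    return method.upper()
```

Restricting the override to POST is a design choice of this sketch, so that caches and crawlers issuing GET requests cannot trigger changes.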

Single-page applications usually need the updated object returned in the POST response so they can render it without an additional request, while some clients do not need this information at all.

Mental model

It would be nice if the API is easy to understand without any diagrams, just clean logic. Also, try to find a similar well-known API and apply its design to yours. Check the pros and cons. For the consumers of your API, it will be easier to adopt an approach they are already familiar with.

Corner cases

How should empty values, out-of-bounds values, and incorrect references be interpreted and handled?
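
One possible way to make those decisions explicit is sketched below for the {"book_id": 123} body from the question. The function name check_book_link and the choice of 400 and 422 as status codes are assumptions of this sketch, not something the question or a standard mandates; 404 or 409 could be equally defensible for a bad reference.

```python
from typing import Optional


def check_book_link(body: dict, known_book_ids: set[int]) -> Optional[tuple[int, str]]:
    """Return (status, message) for a bad {"book_id": ...} body, else None."""
    book_id = body.get("book_id")
    # Empty value: the key is missing, null, or an empty string.
    if book_id is None or book_id == "":
        return 400, "book_id is required"
    # Out of bounds: bool is a subclass of int in Python, so reject it too.
    if isinstance(book_id, bool) or not isinstance(book_id, int) or book_id <= 0:
        return 400, "book_id must be a positive integer"
    # Incorrect reference: well-formed id that points at no existing book.
    if book_id not in known_book_ids:
        return 422, "book_id does not refer to an existing book"
    return None
```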

Non-functional requirements

Here I mean mostly speed and data volume: how many requests are needed to complete one user operation, and how much data goes over the wire? Please take a look at this brilliant article, especially the part dedicated to caching.

Answering your questions

First, your data model does not look 100% correct to me:

The distributor_publisher_contract table does not have a publisher id.
An author may have more than one publisher over a lifetime, so this is not a relation that a single nullable publisher_id can capture.

The most straightforward way is to create a contract between the author and the distributor for a book (or for several books, which would be a batch operation in a strictly normalised database).

POST /authorDistributorContracts

{
    "distributorId": "123",
    "authorId": "234",
    "bookIds": ["a", "b", "c"]
}

A REST API does not need to reflect the design of your internal data structure. You can have a POST for authorDistributorContracts but no GET for it, because the relations will be reflected directly (as references) in the JSON of the Author, Book, and Distributor entities.

What to return in the response here?

Return the object you created with the POST request, or simply its id. Details here: https://stackoverflow.com/a/19201805/1333262


{
    "id": "auto generated id here",
    "distributorId": "123",
    "authorId": "234",
    "bookIds": ["a", "b", "c"]
}
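
Both response options can be sketched in a few lines. This is only an illustration under assumptions: the in-memory contracts dict stands in for real storage, create_contract and its minimal flag are hypothetical names, and the 201 status with a Location header follows common HTTP practice rather than anything stated above.

```python
import uuid

contracts: dict[str, dict] = {}  # in-memory stand-in for the real table


def create_contract(body: dict, minimal: bool = False) -> tuple[int, dict, dict]:
    """Handle POST /authorDistributorContracts; returns (status, headers, body)."""
    contract_id = str(uuid.uuid4())  # server-generated identifier
    contract = {"id": contract_id, **body}
    contracts[contract_id] = contract
    headers = {"Location": f"/authorDistributorContracts/{contract_id}"}
    # Either echo the stored object or return only its id.
    return 201, headers, ({"id": contract_id} if minimal else contract)
```

A single-page client would typically want the full object, while a batch importer might pass minimal=True and keep only the id.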

DELETE /distributors/{distributor_id}/books/{book_id} is quite a reasonable option. If your API is transparent to the end user, DELETE /authorDistributorContracts/{contractId}/books/{bookId} also looks good. In the first case, the backend must first find the distributor's contract for the book and then update the object or table record or, if the schema is even more normalised, the AuthorDistributorContractsBooks table.
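
For the simpler book_has_distributor case from the question, the link routes could behave as in the rough sketch below. The in-memory links set, the function names, and the specific status codes (201 or 204 for PUT, 204 or 404 for DELETE) are assumptions made for illustration.

```python
# In-memory stand-in for the book_has_distributor join table:
# each entry is a (distributor_id, book_id) pair.
links: set[tuple[int, int]] = set()


def put_book_link(distributor_id: int, book_id: int) -> int:
    """PUT /distributors/{distributor_id}/books/{book_id} with an empty body."""
    created = (distributor_id, book_id) not in links
    links.add((distributor_id, book_id))
    # 201 the first time, 204 on repeats: repeating the PUT changes nothing.
    return 201 if created else 204


def delete_book_link(distributor_id: int, book_id: int) -> int:
    """DELETE /distributors/{distributor_id}/books/{book_id}."""
    if (distributor_id, book_id) not in links:
        return 404  # assumption: a missing link is reported, not ignored
    links.remove((distributor_id, book_id))  # the book itself is untouched
    return 204
```

Because both operations address the link by its own URL, neither needs a request body, and only the join row is created or removed.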

Moreover, you can define a PATCH operation with your own syntax inside, for example

PATCH /distributors/{distributor_id}

{
    "operation": "DELETE_BOOK",
    "bookId": "a"
}
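
A server-side dispatcher for such a body might look like the sketch below. The distributor_books sample data, the ADD_BOOK operation, and the error payloads are hypothetical additions; only DELETE_BOOK and bookId come from the example above.

```python
distributor_books: dict[str, set[str]] = {"123": {"a", "b", "c"}}  # sample data


def patch_distributor(distributor_id: str, body: dict) -> tuple[int, dict]:
    """Handle PATCH /distributors/{distributor_id} with a custom operation body."""
    books = distributor_books.get(distributor_id)
    if books is None:
        return 404, {"error": "unknown distributor"}
    operation, book_id = body.get("operation"), body.get("bookId")
    if not book_id:
        return 400, {"error": "bookId is required"}
    if operation == "ADD_BOOK":  # hypothetical counterpart to DELETE_BOOK
        books.add(book_id)
    elif operation == "DELETE_BOOK":
        books.discard(book_id)  # removing an absent book is a no-op
    else:
        return 400, {"error": f"unsupported operation: {operation}"}
    # Return the updated collection so the client can re-render it.
    return 200, {"id": distributor_id, "bookIds": sorted(books)}
```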

Which method is better should be decided based on the factors listed at the beginning.

I would recommend always using an id where possible. It can later be used for logging, debugging, and troubleshooting on both the server and client sides.

Extra questions

PUT is used when the client provides all the properties of an object, which is not the case here if you just want to set the value of a single property. DELETE /publishers/{publisher_id}/authors/{author_id} would work here, as would the PATCH approach described earlier.
Not sure I understood the question.
In a "normalised world" all entities are equal; otherwise they are not entities but just embedded values. Decide whether you stick to normalisation (relational databases generally imply it) and show this to your end user, or embed some objects into others (as document databases like MongoDB do). The back-end logic can do whatever you like. You can have an API that represents resources as documents with embedded collections or documents inside, with Postgres as the backend database and a fully normalised structure, and vice versa. "Normalised mode on": you have two separate endpoints to create resources and a third one to create a link between them. In this case the backend logic is lighter, because the client decides when to create each entity and resolves possible desynchronisation issues. "Normalised mode off": you have one endpoint which creates parent and child objects and links them together in one call. Backend logic is heavier here, because the server takes on more responsibility.
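
The two modes can be contrasted with a small sketch. It uses distributors and books with in-memory dicts as storage; the function names and the name and title fields are hypothetical, chosen only to make the example concrete.

```python
import itertools

_next_id = itertools.count(1)
books: dict[int, dict] = {}
distributors: dict[int, dict] = {}
links: set[tuple[int, int]] = set()  # (distributor_id, book_id) pairs


# "Normalised mode on": three small calls, orchestrated by the client.
def create_book(title: str) -> dict:
    book = {"id": next(_next_id), "title": title}
    books[book["id"]] = book
    return book


def create_distributor(name: str) -> dict:
    distributor = {"id": next(_next_id), "name": name}
    distributors[distributor["id"]] = distributor
    return distributor


def link_book(distributor_id: int, book_id: int) -> None:
    links.add((distributor_id, book_id))


# "Normalised mode off": one call; the server creates and links everything.
def create_distributor_with_books(name: str, titles: list[str]) -> dict:
    distributor = create_distributor(name)
    created = [create_book(title) for title in titles]
    for book in created:
        link_book(distributor["id"], book["id"])
    return {**distributor, "books": created}
```

In the first mode a failure between calls leaves the client to clean up; in the second the server owns that responsibility, which in a real backend would usually mean wrapping the three steps in one transaction.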