Commit b46560c0 authored by Alcides Viamontes E

Merge branch 'prerelease/0.1' into develop

parents 944cf07e 2a213e9a
......@@ -11,3 +11,5 @@ UpgradeLog\.htm
WnScNewsAgent\.sln\.DotSettings\.user
maria/
WnScNewsAgent\.sublime-workspace
## Goal
Given a folder in Windows, create a program that watches that folder and notifies of changes to the files inside it.
For example, let's assume we ask this program to watch the folder called `c:\webroot\static files`.
For the sake of the examples below, let's also assume that there is a file `c:\webroot\static files\wp-content\upload\Plüsch_rosa_Höschen_für_das_Eichhörnchen.jpeg` which happens to be inside that folder.
We want to know whenever files inside that folder (`c:\webroot\static files` in our example) are created, changed, or deleted, by registering those events inside a directory structure right inside `c:\webroot\static files`, as explained below.
## The changelist folder
The events are registered as lines inside files with particular names.
We call the whole arrangement "a changelist", because at the end of the day, it's just a linear list of changes, sorted from earliest to latest change.
However, we want our software to be able to read recent entries in an inexpensive way, independently of the way the folder and its files are exposed to the network.
Thus, changelists are stored in a format amenable for incremental read, distributed across files under a directory with a special name, `__sc_changelist__`, below the root of the exposed directory.
In our example, this would result in a directory at `c:\webroot\static files\__sc_changelist__`.
We use the 64-bit representation of the [Posix time](https://en.wikipedia.org/wiki/Unix_time), and create hierarchical directories
named after the hex representation of the 64-bit word in big-endian order, with the four higher octets abbreviated *as a single* `0`, and where the second smallest
octet names a file instead of a folder. That file includes all the changes that happen in the 256 seconds it covers.
Here is an example: the changes to the filesystem between Monday, January 22, 2018 1:45:00 PM and Monday, January 22, 2018 1:50:00 PM (UTC) would be registered by taking the timestamps of those two dates:
```
1516628700
1516629000
```
converting them to hex and including the padding (to avoid the [Year 2038 problem](https://en.wikipedia.org/wiki/Year_2038_problem)):
```
0x 00 00 00 00 5a 65 ea dc
0x 00 00 00 00 5a 65 ec 08
```
We abbreviate the first four octets as a single folder named `0` in the filesystem, and
that results in the following files:
```
__sc_changelist__ / 0 / 5a / 65 / ea
__sc_changelist__ / 0 / 5a / 65 / eb
__sc_changelist__ / 0 / 5a / 65 / ec
```
In the list above, the rightmost group of each line is a file: `ea`, `eb` and `ec`.
Notice that the lowest octet is omitted from the combination of directories and file names.
It's OK if files that otherwise would be empty are missing.
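The mapping from a timestamp to a changelist file can be sketched in a few lines of Python. This is only an illustration of the rule described above, not code from the agent; the function names `changelist_file_path` and `changelist_files_between` are made up for this example, and the sketch assumes timestamps whose four higher octets are zero.

```python
def changelist_file_path(posix_seconds: int) -> str:
    """Return the changelist file (relative to the watched folder)
    covering the given POSIX timestamp, per the rule described above."""
    if not 0 <= posix_seconds < 2**32:
        # Assumption of this sketch: the four higher octets are all zero,
        # which is the case abbreviated as the single folder `0`.
        raise ValueError("timestamp outside the range covered by folder `0`")
    octets = posix_seconds.to_bytes(8, "big")
    # octets[4] and octets[5] name folders, octets[6] names the file,
    # and octets[7] (the lowest octet) is omitted.
    parts = ["__sc_changelist__", "0"] + [f"{b:02x}" for b in octets[4:7]]
    return "/".join(parts)


def changelist_files_between(start: int, end: int) -> list:
    """List every changelist file that may hold entries in [start, end]."""
    first_bucket, last_bucket = start >> 8, end >> 8  # 256-second buckets
    return [changelist_file_path(bucket << 8)
            for bucket in range(first_bucket, last_bucket + 1)]


print(changelist_files_between(1516628700, 1516629000))
```

For the two timestamps of the example, this lists the three files ending in `ea`, `eb` and `ec`.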
In the files themselves (`ea`, `eb` and `ec` in the example above),
each line uses one of three syntaxes, all of which we describe using a simple grammar:
```
line ::= change_entry
| delete_entry
| reset_cmd
change_entry ::= '~' TOKEN_TIMESTAMP TOKEN_PATH_LITERAL NEWLINE
delete_entry ::= '-' TOKEN_TIMESTAMP TOKEN_PATH_LITERAL NEWLINE
reset_cmd ::= 'RESET' TOKEN_TIMESTAMP NEWLINE
```
- TOKEN_TIMESTAMP is a timestamp in hex representation, so that it's easier
to match the filename with the timestamps present inside it.
The timestamps should be expressed in hexadecimal with
a resolution of seconds, that is, without truncating them the way it is done with the
filename.
- TOKEN_PATH_LITERAL is a valid normalized relative path *in Unix notation*, i.e. something like
`my/file/there`, without any `.` or `..` component, not ending in `/`, and
with a single `/` as separator. The path should be given relative to the folder we are watching.
In the example, we would use `wp-content/upload/Plüsch_rosa_Höschen_für_das_Eichhörnchen.jpeg` instead
of `c:\webroot\static files\wp-content\upload\Plüsch_rosa_Höschen_für_das_Eichhörnchen.jpeg`.
- NEWLINE is preferably the Unix newline, but the DOS newline is also parsed, so that one is OK too.
- The lines in the file are sorted by the timestamps they contain, so that the
creator of the file can simply append entries to it as things happen.
- The file is encoded using UTF-8.
There are two types of entries: one that is used when a file is created
or changed, and another which is used when a file is deleted.
And one command, `RESET`, which can be used by the changelist generator when
it starts running to tell ShimmerCat to invalidate its whole internal store.
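On the writing side, producing a line amounts to converting the Windows path to a relative Unix-style path and printing the timestamp in hex. The following Python sketch illustrates that under some assumptions: the helper names `to_changelist_path` and `format_entry` are hypothetical, and the single space between tokens is inferred from the example contents rather than stated by the grammar.

```python
from pathlib import PureWindowsPath


def to_changelist_path(watched_root: str, absolute_path: str) -> str:
    """Turn an absolute Windows path into a path relative to the watched
    folder, using `/` as the separator."""
    relative = PureWindowsPath(absolute_path).relative_to(
        PureWindowsPath(watched_root))
    return relative.as_posix()


def format_entry(kind: str, posix_seconds: int, relative_path: str = "") -> str:
    """Format one changelist line. `kind` is '~' (created or changed),
    '-' (deleted) or 'RESET'. Assumes tokens are separated by one space."""
    stamp = f"{posix_seconds:x}"  # full resolution, lowercase hex
    if kind == "RESET":
        return f"RESET {stamp}\n"
    if kind not in ("~", "-"):
        raise ValueError(f"unknown entry kind: {kind!r}")
    return f"{kind} {stamp} {relative_path}\n"


root = r"c:\webroot\static files"
changed = root + r"\wp-content\upload\Plüsch_rosa_Höschen_für_das_Eichhörnchen.jpeg"
print(format_entry("~", 0x5A65EC02, to_changelist_path(root, changed)), end="")
```

`PureWindowsPath` is used so that the conversion behaves the same regardless of the operating system running the sketch.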
So, here are some example contents for the file `__sc_changelist__/0/5a/65/ec`
used as an example above, assuming a German-Japanese store owner wants to upload
items branded after the anime [Banner Tail](https://en.wikipedia.org/wiki/Bannertail:_The_Story_of_Gray_Squirrel_(anime))
using a whimsical mixed alphabet[^1]:
```
RESET 5a65ec00
~ 5a65ec02 wp-content/upload/Plüsch_rosa_Höschen_für_das_Eichhörnchen.jpeg
~ 5a65ec02 wp-content/upload/睡眠マット_mat.jpeg
```
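A reader of the changelist could parse such lines along the following lines. This is a minimal Python sketch of the grammar, not ShimmerCat's parser; the `Entry` type and `parse_line` function are names invented for the example, and it assumes a single space separates the tokens, as in the example contents.

```python
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    kind: str             # "~", "-" or "RESET"
    timestamp: int        # POSIX seconds
    path: Optional[str]   # None for RESET


def parse_line(line: str) -> Entry:
    """Parse one changelist line; accepts Unix and DOS newlines."""
    line = line.rstrip("\r\n")
    kind, _, rest = line.partition(" ")
    if kind == "RESET":
        return Entry("RESET", int(rest, 16), None)
    if kind in ("~", "-"):
        # Split only on the first space, so paths may contain spaces.
        stamp, _, path = rest.partition(" ")
        if not path:
            raise ValueError(f"missing path in line: {line!r}")
        return Entry(kind, int(stamp, 16), path)
    raise ValueError(f"unrecognized changelist line: {line!r}")


for raw in ["RESET 5a65ec00\n",
            "~ 5a65ec02 wp-content/upload/睡眠マット_mat.jpeg\r\n"]:
    print(parse_line(raw))
```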
\ No newline at end of file
Copyright 2018 ShimmerCat AB
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
\ No newline at end of file
- Estimated reading time: 3 minutes
- Estimated time to complete this guide, with .NET Core installed: 5 minutes
## What is this?
Suppose you have a web application and want to serve its static
assets from a remote cache.
You want that cache to pick
up changes to your static files as soon as they happen, without having
to pay a huge bandwidth cost.
One way to set that up is to have your origin server write a log of changes
following a certain convention, and have the cache read that log every few
seconds.
Because all changes are logged in just one place, the cache can fetch them
frequently and with little cost.
WnSCNewsAgent implements the origin side of that arrangement on Windows
computers.
It is a tiny program that watches a folder in the origin
for changes, and writes a log of those changes (we call it a "changelist") inside the watched folder, in a subdirectory
called `__sc_changelist__`.
We chose a format that creates small files inside a short hierarchy of directories,
so that we can fetch each of the most recent ones separately.
This keeps the network traffic incremental and fluid.
To know more about the changelist format, please consult the file `CHANGELIST_SPEC.md`
inside this folder.
This particular repository is the Windows version of the agent.
### What about the caching mechanisms in HTTP?
The caching mechanism in HTTP/1.1 assumes that the cache or the client
checks if an asset is fresh at configured time points, e.g. using the `max-age`
directive of the `Cache-Control` header.
If you want your changes to be picked up, say, at most five seconds after you
make them, you need to set `max-age` to five seconds.
This has two effects:
- For each particular asset, the origin server will get a request every five
seconds at peak traffic times, which negates the benefits of caching.
Also, the number of these requests to the origin is proportional to the number
of assets on the site.
For a site with lots of images and products, a short `max-age` can make
validation requests a significant proportion of all the traffic in the site.
- Most web-speed auditing tools will complain about your `max-age` times being
too short.
Chances are people in your professional orbit love those tools,
and they will forward you every warning about a `max-age` setting that is too short.
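To put rough numbers on the first effect, here is a back-of-the-envelope calculation in Python. The figures (10,000 assets, one cache) are hypothetical, and the estimate assumes every asset is requested at least once per `max-age` window at peak time.

```python
def peak_validation_rate(asset_count: int, max_age_seconds: float) -> float:
    """Upper bound on revalidation requests per second reaching the origin,
    assuming each asset is revalidated once per max-age window."""
    return asset_count / max_age_seconds


# Hypothetical site: 10,000 static assets, changes visible within 5 seconds.
per_asset_validation = peak_validation_rate(10_000, 5)

# Polling one changelist every 5 seconds instead: about one request per poll.
changelist_polling = peak_validation_rate(1, 5)

print(per_asset_validation, changelist_polling)
```

Under these assumptions the origin sees on the order of 2,000 validation requests per second with a short `max-age`, against a fraction of a request per second when polling a changelist.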
The solution everybody has settled for is to set a very large `max-age` and
add a query string that changes with each new version of the asset at the end of the
path of each static asset.
While that works, it's very hard to ask everybody to do it all the time,
especially if you rely on web application software whose source code you
don't maintain or have access to.
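For reference, the query-string technique usually looks something like the Python sketch below. It is a generic illustration, not part of WnSCNewsAgent: the function name `versioned_url`, the parameter name `v` and the choice of a truncated SHA-256 digest are all assumptions made for the example.

```python
import hashlib


def versioned_url(url_path: str, content: bytes) -> str:
    """Append a query string derived from the asset's content, so the URL
    changes whenever the content does and a long max-age stays safe."""
    digest = hashlib.sha256(content).hexdigest()[:12]
    separator = "&" if "?" in url_path else "?"
    return f"{url_path}{separator}v={digest}"


print(versioned_url("/static/app.css", b"body { color: black; }"))
```

Every template or page that links to the asset has to call something like this, which is exactly the part that is hard to enforce in software you don't control.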
## How to install WnScNewsAgent
WnSCNewsAgent is a .NET program, targeting .NET Core 2.1.
You can install it either from source or from the [FDD](https://docs.microsoft.com/en-us/dotnet/core/deploying/index) binaries we provide.
In both cases, the first step is to ensure you have .NET Core installed. You
can follow the instructions at
https://www.microsoft.com/net/learn/get-started/windows
### Installing from source
Clone this repository:
```
git clone https://gitlab.zunzun.se/public-items/WnSCNewsAgent.git
```
and build:
```
cd WnSCNewsAgent
dotnet build -c Release
```
The binaries will be found at `NotificationWriterService/bin/Release/netcoreapp2.0`.
### Installing from FDD binaries
Download and decompress the files from ...
## Configuration
Go to the folder where the binaries are, and edit the file `appsettings.json` (you
can use Notepad or any other text editor).
Originally, it looks like this:
```JSON
{
  "WatchDirectoryPath": "c:/temp",
  "WriteChangesDelay": 1000 //milliseconds
}
```
Change the entry `WatchDirectoryPath` to the absolute path to the folder where your
static assets are.
Notice that `WnSCNewsAgent` needs write access to it, or at least to the folder
`__sc_changelist__` inside that target folder.
## Running the program
You can run the program directly in a terminal, by `cd`-ing to the folder where
the binaries are and executing
```
dotnet NotificationWriterService.dll
```
In practice though, for a server environment you would want an automatic way of starting
and monitoring the service.
A popular option is to use [nssm](https://nssm.cc/). The steps are straightforward:
1. Install `nssm`. There are instructions on its home page, but basically we just copied the
binary to `C:\Windows\System32`.
2. Open an Administrator console. This can be done by opening the task manager, selecting
"Run new task" in the "File" menu, entering "cmd.exe", checking "Create this task with administrative privileges", and pressing "Enter".
3. In the terminal that opens, type `nssm install WnScNewsAgent`.
A dialog appears. There, fill in "Path" with the path to the `dotnet` executable (which you can
obtain by executing `WHERE dotnet` in the console). For "Startup directory" use the path where
the binaries of `WnScNewsAgent` are, and for "Arguments" use "NotificationWriterService.dll".
You can tweak other details about the service if you need to, and then finish with "Install Service".
4. Go to the Windows service manager, and start the service with the name `WnScNewsAgent`.
This is a good moment to double-check that the service is configured to start automatically.
## Verify the program is running as expected
Go to the folder being watched, and create or edit a file.
If everything is working all right, you should find a new `__sc_changelist__` folder and, nested inside it
under a folder named `0` and a couple of two-letter folders, a file containing a line for the file you created or changed.
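If Python happens to be available on the machine, the check can also be scripted. The sketch below is only a convenience and not part of WnSCNewsAgent; the function name `latest_changelist_lines` is made up, and it assumes changelist files are named with lowercase two-digit hex octets, so that the greatest path in lexicographic order is the most recent file.

```python
import os


def latest_changelist_lines(watched_folder: str, count: int = 5) -> list:
    """Return the last `count` lines of the most recent changelist file,
    or an empty list if no changelist file exists yet."""
    root = os.path.join(watched_folder, "__sc_changelist__")
    files = []
    for directory, _subdirs, names in os.walk(root):
        for name in names:
            files.append(os.path.join(directory, name))
    if not files:
        return []
    # Fixed-width hex names: the greatest path is the latest 256-second file.
    newest = max(files)
    with open(newest, encoding="utf-8") as handle:
        lines = handle.read().splitlines()
    return lines[-count:]


if __name__ == "__main__":
    # "c:/temp" is the default WatchDirectoryPath; adjust it to your setup.
    for line in latest_changelist_lines("c:/temp"):
        print(line)
```

After creating or editing a file in the watched folder, the output should end with a `~` line naming that file.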
## Reporting issues
If you need to report an issue regarding this program, either use your support channel to our company,
or write to `ops@shimmercat.com` using a subject that starts with "WnSCNewsAgent".
\ No newline at end of file
{
"folders":
[
{
"path": "."
}
]
}