serlo / cloudflare-worker Goto Github PK
View Code? Open in Web Editor NEWCloudflare worker which works as a proxy for https://serlo.org/
License: Apache License 2.0
Cloudflare worker which works as a proxy for https://serlo.org/
License: Apache License 2.0
The route /user/profile/<username>
should be handled the same way as User
s are. So if User
s should be handled by the legacy system (as it is currently), /user/profile/*
must also be handled by the legacy system.
Shall https://github.com/serlo/serlo.org-cloudflare-worker/blob/master/wrangler.toml#L65 be updated for frontend 25% (ping @elbotho @dal )
This is due to the error: "KV GET failed: 414 UTF-8 encoded length of 523 exceeds key length limit of 512." (ping @jakobwes )
Steps to reproduce:
Tested in chromium on mac
Might be related to #38
Currently we have to take care not to prefix special paths (like /auth/…
or /api/…
) with the instance, like we do for most other links.
Idea:
Maybe we could always put the instance in an query string instead (e.g. ?instance=de
) this way it would just be ignored for special paths instead of breaking everything.
See #11 (comment) (however we need to see whether this is a good idea)
@inyono would you agree, that coping the old robots.txt would make sense?
Any important changes/updates?
current version:
User-agent: *
Disallow: /
old design:
User-agent: *
Disallow: /page/revision/revisions/
Disallow: /page/revision/revision/
Disallow: /page/revision/
Disallow: /entity/repository/history/
Disallow: /entity/repository/compare/
Disallow: /backend
Disallow: /users
Disallow: /horizon
Disallow: /flag
Disallow: /license
Disallow: /uuid/recycle-bin
Disallow: /navigation/
Disallow: /authorization/
Disallow: /pages
Disallow: /uuid/recycle-bin
Disallow: /index.php/
Disallow: /index.php
Disallow: /*/entity/trash-bin
The following routes always need to be handled by the legacy backend for now:
I enabled internationalization in both dev & staging. When opening https://en.serlo-staging.dev, you get redirect to https://en.serlo-staging.dev/en which doesn't exist.
Notifications are supported in the new frontend under /user/notifications
but it seems the CF worker does not know about it and we end up with "Diese Seite existiert nicht!" in the legacy system.
Steps (all with enabled frontend):
We haven't implemented the Content API in the redesign, yet. If the request includes any Content API query param, we therefore need to use the legacy backend.
Currently, we always use useFrontend
as the cookie name. This won't have the correct behavior when we update the probability, though (since the old value is still used). We should add an experiment number to the cookie name instead, e.g. use-frontend-1
which we increment when we change the test so that users receive a new cookie value.
See https://de.serlo-staging.dev/license/detail/1, the frontend should be ready, but it's somehow not working yet, also need to check CF worker.
For SEO purposes, we shouldn't be able to access frontend.serlo.org
directly (which is the production domain in the Vercel dashboard for the frontend project). Instead: redirect to serlo.org
I have learned a lot about how to write good tests in the last months (mostly by https://kentcdodds.com/blog/ who's philosophy inspired https://testing-library.com/docs/intro). The main takeaway when it comes to mocking services is:
The mocked services shall mimic the behavior of the real services.
An example: Instead of intercepting an outgoing request to see whether it has the right authorization header we write the service in a way that when it receives a request without a proper authorization header it answers with "403 Forbidden". See https://mswjs.io/docs/recipes/request-assertions
Another similar advice is the following:
Tests should never test implementation details. When we write a test for a piece of software we should consider who the consumer of this software unit will be (a developer using the module like he would use a library or a user of serlo doing requests with his browser). How can she / he observe the feature which shall be tested? What steps does she / he need to do? Then implement the test by putting those steps into code. Looking into the module (e.g. testing implementation details) should be avoided since the consumer won't be able to do it either.
The cloudflare worker is our proxy which implements some services (e.g. rendering the imprint / terms) or does other jobs (like redirecting). A user of serlo will be the main consumer of the cf worker. Thus we need to ask ourselfs: How can a user of serlo notice a particular feature (notice that he / she has no control nor can he / she look into databases, services like api.serlo.org nor he / she can intercept requests in our network)?
Example: We want to test the api
module which adds authorization header to calls to api.serlo.org
. To test this feature in the real world (for example in our staging environment) we would simply make an api call in this environment. When it succeeds we know that everything is ok since api.serlo.org checks for the authorization header and would deny the request in case it is not present or wrong.
When we follow those philosophies we can achieve the following:
We can write tests which can be run locally by mocking not present services and against real environments like staging or dev. Thus we can reuse the tests for testing the cloudflare worker at staging and dev as well (which in a future we might want to do in a cron job to monitor those environments).
When we archive this we have some advantages:
Our tests in the cloudflare worker do not fulfill those requirements yet. However I want to change it and I want to test this philosophy here to see whether the efforts are worth it.
I want to rewrite the tests in a way that they can also be used to be executed against our test environments staging / dev. The necessary changes are:
However not all tests can be written in this way. There are some problems to solve (see next section). We might also want to test some utility function (like for example getCookieValue()
) to get confidence that it works as intended. Here we can treat it like an external library and test it this way. To make the distinction between tests for the whole cf worker and some internal modules we can have two dictionaries:
__tests__/spec
for tests which tests the specification of the cf worker and which can be also executed in staging / dev.__tests__/unit
for unit testing utility function / internal modules for getting more confidence that they work as intended. These tests can only run locally.In case we have tests which change an environment (like adding an article instead for example reading the imprint) we can flag those mutating test cases. So we can run the tests also against production but only those who do not change it. These tests can be used by a cron job to monitor the production environment and to detect bugs there fast -> so hopefully no end user will need to report a bug which we should have already covered by our tests since we already know it presence in advance we can thus fix it.
Comments appreciated since I would start to rewrite our tests in this direction. However achieving frontend 100% is more important currently 😄 Ping @inyono since this is important to know for my current PRs.
Sometimes, we want to test the integration of a preview deployment with de.serlo.org (or serlo-staging.dev). For this case, it would be useful to set the frontend url manually, e.g. by using a cookie.
Had to revert the production Cloudflare Worker to the previous working state. Homepage works, pretty much everything else doesn't, see e.g.:
There are several tasks to do:
The KV definition at https://github.com/serlo/serlo.org-cloudflare-worker/blob/master/src/kv.d.ts#L30-L37 needs to be updated. The final definition shall be:
declare interface KV<Key extends string> {
get: (key: Key) => Promise<string | null>
put: (
key: Key,
value: string,
options?: { expirationTtl: number }
) => Promise<void>
}
The values
property of createKV
shall be deleted at https://github.com/serlo/serlo.org-cloudflare-worker/blob/master/__tests__/__utils__/kv.ts#L23
Thus instead of https://github.com/serlo/serlo.org-cloudflare-worker/blob/master/__tests__/maintenance.ts#L52 we need to be
await global.MAINTENANCE_KV.put("enabled", JSON.stringify(value))
Therefore all KVs need to be initialized at https://github.com/serlo/serlo.org-cloudflare-worker/blob/master/jest.setup.ts#L56-L58
change signature of createKV
function createKV<Keys extends string>(): KV<Keys>
Similar to start.serlo.org
we want to support redirects from meet.serlo.org
to our Serlo-Calls:
Example code can be found at https://github.com/serlo/serlo.org-cloudflare-worker/blob/master/src/index.ts#L70-L84
In our tests we often mock requests to the API:
It would be great to have a implementation mockApi()
which mocks global.API_ENDPOINT
like
mockApi(...) = mockHttpPost(global.API_ENDPOINT, ...)
The implementation of mockAPI should check that the authorization and the content type is correct. Basically it should include the code https://github.com/serlo/serlo.org-cloudflare-worker/blob/d46aad76dc0cfe8f875ed6604098f36dd6baf048/__tests__/api.ts#L39-L47
All mockHttpPost()
calls in the test code should be replaced and this function shall be deleted.
Also there should be a possibility to check for variables so that the following tests are not necessary any more: https://github.com/serlo/serlo.org-cloudflare-worker/blob/d46aad76dc0cfe8f875ed6604098f36dd6baf048/__tests__/utils.tsx#L163-L218
For the function mockApi()
the functions mockHttpPost()
https://github.com/serlo/serlo.org-cloudflare-worker/blob/1645aa0ab5924c6f4b411b865fca00142f308c2d/__tests__/_helper.ts#L92-L94 and returnApiUuid()
https://github.com/serlo/serlo.org-cloudflare-worker/blob/1645aa0ab5924c6f4b411b865fca00142f308c2d/__tests__/_helper.ts#L107-L109 shall be mergable
Todos:
mockHttpPost()
with mockApi()
(mockApi()
mockApi()
We can update node-fetch to 3.0. Since it is only used as a dev dependency, it is okay when node fetch is still beta Then we can delete the lines https://github.com/serlo/serlo.org-cloudflare-worker/blob/d587a3ecfca13e7d123dbb5d5b7136d1289d2aab/jest.setup.ts#L89-L94
At __tests__/frontend-proxy.ts
we have a function expectRequestFrom()
which checks that a request was made from the right backend: https://github.com/serlo/serlo.org-cloudflare-worker/blob/1645aa0ab5924c6f4b411b865fca00142f308c2d/__tests__/frontend-proxy.ts#L542-L555
However there is code in the unit tests which also check for successful requests: https://github.com/serlo/serlo.org-cloudflare-worker/blob/1645aa0ab5924c6f4b411b865fca00142f308c2d/__tests__/index.ts#L39-L45
So there is the possibility to create a new helper function which is an abstraction of all those test cases and which can be used to shorten the test code and to make it more readable.
Function to be tested: https://github.com/serlo/serlo.org-cloudflare-worker/blob/master/src/url-utils.ts#L39-L41
I found one additional special case that the legacy system currently handles (and that we probably want to have in the Cloudflare Worker in the long-term). Some UUIDs are instance-specific (e.g. entities). If you refer to an entity in the wrong instance, e.g. https://de.serlo.org/93321, you'll get redirected to the correct instance instead https://en.serlo.org/93321. Basically: in your current alias query, you should also get the instance (where applicable) and then possibly redirect to the correct instance. Currently, you'd have to do that for each type specifically (since AbstractUuid
doesn't have an instance
property), ping serlo/api.serlo.org#131 as a follow-up (or preceding issue) that would simplify that.
The current way to mock a KV store is
The function adds a mock for the KV to the global namespace. With this approach we need a generic variable setting at https://github.com/serlo/serlo.org-cloudflare-worker/blob/c96e71181f854840b335edb1118add87e3917042/__tests__/_helper.ts#L67 and the line https://github.com/serlo/serlo.org-cloudflare-worker/blob/c96e71181f854840b335edb1118add87e3917042/__tests__/_helper.ts#L64 is a duplication of the definition at https://github.com/serlo/serlo.org-cloudflare-worker/blob/c96e71181f854840b335edb1118add87e3917042/src/kv.d.ts#L22-L28
A solution is to create a function createKV()
at __tests__/__utils__/index.ts
with the signature:
export function createKV<Keys extends string>(): KV<Keys> {
...
}
Then we can use this function in our test code with for example
global.PATH_INFO_KV = createKV()
See https://developers.cloudflare.com/workers/runtime-apis/kv for a doc what a KV store is.
Currently we redirect requests to /user/profile/<user-id>
to /user/profile/<username>
. However the solution is implemented in a way that a request to /user/profile/<article-id>
are resolved to /<article-path>
. Bugfix: There shall not be a redirect when /user/profile/<article-id>
is accessed. This case needs to be handled in https://github.com/serlo/serlo.org-cloudflare-worker/blob/master/src/utils.tsx#L153-L164 in a way that the current path of /user/profile/<article-id>
is the same as /user/profile/<article-id>
.
When some handles this issue, also the todo https://github.com/serlo/serlo.org-cloudflare-worker/blob/master/src/index.ts#L120-L126 can be resolved (after serlo/api.serlo.org#84 got merged)
We have implemented our cf worker in a way that we split the services / modules into seperate function like frontendProxy()
at https://github.com/serlo/serlo.org-cloudflare-worker/blob/c96e71181f854840b335edb1118add87e3917042/src/frontend-proxy.ts#L9
Those function return null
when they aren't supposed to handle a request so that it can be given to the next function. Instead of unit testing this behavior - for example at https://github.com/serlo/serlo.org-cloudflare-worker/blob/c96e71181f854840b335edb1118add87e3917042/__tests__/frontend-proxy.ts#L450-L454 we should find a way how we can test the right behavior as a user (see RFC at #76 ). My suggestion: We find requests which shall not be changed like requests to stats.serlo.org and we test that those requests return the right response. Afterwards we can remove all of the "returning null
" tests.
When the frontend tries to load de.serlo-staging.dev/_next/static/Xi-6lAt8rBvyzVsDMDnU5/_ssgManifest.js, it gets redirected to https://de.serlo-staging.dev/mathe/5/mathe.
Related to serlo/serlo.org-legacy#540
PageRevision
to uuid
currentRevision
to Page
Master has regressions in the fronted proxy (compare de.serlo-development.dev (master) with de.serlo-staging.dev (previous working version)), e.g.:
When trying it out in staging, I also got various flashing content / Karmilla font didn't work. Can't reproduce on dev, though, so might be caching issues.
__tests__/static-pages.tsx
is the only test suite next to __tests__/utils.ts
which tests internals of the cloudflare worker (other functions than handleRequest()
). To fulfill our requirements of #76 we need to refactor this file.
Undo c96e711 when this is hotfix is not needed any more
Follow up of #71
Currently we support embeds from YouTube, Wikimedia Commons, Vimeo and Geogebra.
For privacy reasons we want to hide our users IPs from the providers until they confirm loading of the external content.
So we want the CF Worker (or maybe even the API) to make the call instead.
The frontend could either query the full video url (example) or values for provider
and video id
or both.
The preview image code is already implemented in this PR. (video.tsx
and geogebra.tsx
)
Explanations:
const previewImageUrl = `https://i.ytimg.com/vi/${videoId}/sddefault.jpg` //might not exists -> 404
const previewImageFallbackUrl = `https://i.ytimg.com/vi/${videoId}/hqdefault.jpg`
A bit annoying that the prefered sddefault
might result in 404, but I's still better than using the YT-API with a key and all, right?
function fetchVimeoInfo() {
void fetch(
'https://vimeo.com/api/oembed.json?url=' + encodeURIComponent(src)
)
.then((res) => res.json())
.then((data) => {
setVimeoImg(data.thumbnail_url.replace(/_[0-9|x]+/, ''))
})
return null
}
Vimeo's public oembed api provides the preview URL, e.g. https://i.vimeocdn.com/video/21004662_295x166.webp
.
By removing the _295x166
part we get the full size image.
Easiest way I found:
const seperator = '/commons/'
const parts = src.split(seperator)
const baseURL = parts[0]
const filenameWithPath = parts[1]
const filename = filenameWithPath.substring(
filenameWithPath.lastIndexOf('/') + 1
)
const previewImageUrl = `${baseURL}${seperator}thumb/${filenameWithPath}/800px--${filename}.jpg`
Video link -> Thumbnail link
structure should be:
'upload.wikimedia.org/wikipedia/commons' + '/thumb' + '/1/15/{title}' + '/{size}px--{title}.jpg
This API call should provide us with the preview URL:
void fetch('https://www.geogebra.org/api/json.php', {
method: 'POST',
body: JSON.stringify({
request: {
'-api': '1.0.0',
task: {
'-type': 'fetch',
fields: {
field: [
{ '-name': 'width' },
{ '-name': 'height' },
{ '-name': 'preview_url' },
],
},
filters: {
field: [{ '-name': 'id', '#text': id }],
},
limit: { '-num': '1' },
},
},
}),
})
Follow up of #55 : We need to make sure that our cache keys are never longer than 512 characters. We can archive this by defining a refinement type like CacheKey
of string
(= strings shorter than 513 characters). Then we can define:
interface KV {
get(key: CacheKey): string
set(key: CacheKey, value: string): void
}
function toCacheKey(key: string): CacheKey {
}
Thus we need to apply toCacheKey()
to a string before we can use it as a cache key. For example https://github.com/gcanti/io-ts/blob/master/index.md#branded-types--refinements can be used.
After the user has enabled/disabled the frontend, the page could redirect the user to the home page after several seconds.
Step for fulfilling #76: Instead of having a test that our cache keys have a maximum length of 512 at https://github.com/serlo/serlo.org-cloudflare-worker/blob/c96e71181f854840b335edb1118add87e3917042/__tests__/utils.tsx#L317-L339 we should move a test that the cache key is smaller or equal 512 characters to our KV mock. It should return an error when a key bigger than 512 characters shall be written to. The tests needs to be rewritten as well. It should request the tamil ressource (which is a real site) and see whether it can be done successfully. So that we can use this test in staging as well.
Todos:
Add to https://github.com/serlo/serlo.org-cloudflare-worker/blob/master/__tests__/__utils__/kv.ts#L31-L33 a test which throws an error, when the given key as length bigger than 512 characters. afterwards the lines https://github.com/serlo/serlo.org-cloudflare-worker/blob/master/__tests__/utils.tsx#L162-L168 can be deleted.
Reverted staging / prod to a938383. Newest dev doesn't work correctly with enabled frontend, though, see e.g.:
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.