Please correct me if I misunderstand how it works. Basically, the input to continuous-eval is a "dataset" (for example, a JSONL file with the inputs and outputs of a "retriever" step, potentially with golden data too). One example use case: an existing RAG pipeline continuously writes data to the dataset, and continuous-eval continuously calculates the metrics.
That's correct, but let me dig a little deeper. Continuous-eval provides two types of metrics, reference-based and reference-free metrics.
In the case of reference-based metrics, you provide a dataset with the input/expected output pairs of each step of the pipeline and use the metrics to measure the performance of the pipeline. This is the best approach for offline evaluation (e.g., in CI/CD) and is the approach that best captures the alignment between what you expect and the actual behavior of the pipeline.
In the case of reference-free metrics, on the other hand, you don't need to provide the expected output, but you can still use the reference-free metrics to monitor the application and get directional insight into its performance.
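To make the distinction concrete, here is a minimal sketch in plain Python. It does not use the continuous-eval API; the function names, the JSONL field names (`retrieved`, `expected`, `answer`), and the word-overlap heuristic are all assumptions made up for illustration. The reference-based metric needs the golden `expected` field, while the reference-free one only looks at what the pipeline actually produced.

```python
import json

def retrieval_recall(retrieved, expected):
    """Reference-based: fraction of golden chunks that the retriever returned."""
    if not expected:
        return 0.0
    hits = sum(1 for chunk in expected if chunk in retrieved)
    return hits / len(expected)

def answer_overlap(answer, retrieved):
    """Reference-free (crude proxy): share of answer words found in the retrieved text."""
    words = answer.lower().split()
    if not words:
        return 0.0
    context = set(" ".join(retrieved).lower().split())
    return sum(1 for w in words if w in context) / len(words)

def evaluate(jsonl_text):
    """Score each JSONL record; recall is only computed when golden data is present."""
    rows = []
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        rec = json.loads(line)
        row = {"overlap": answer_overlap(rec["answer"], rec["retrieved"])}
        if "expected" in rec:  # golden data available: offline / CI-style evaluation
            row["recall"] = retrieval_recall(rec["retrieved"], rec["expected"])
        rows.append(row)
    return rows

# Hypothetical sample: the first record has golden data, the second does not.
SAMPLE = "\n".join([
    json.dumps({
        "retrieved": ["paris is the capital of france"],
        "expected": ["paris is the capital of france", "france is in europe"],
        "answer": "paris is the capital",
    }),
    json.dumps({
        "retrieved": ["rome is in italy"],
        "answer": "rome is in spain",
    }),
])

if __name__ == "__main__":
    for row in evaluate(SAMPLE):
        print(row)
```

In a monitoring setup, records without `expected` would still get the reference-free score, which is the "directional insight" case described above.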
Isn't this typically the responsibility of the manager or the team lead?
An inquiry is often followed by more requests. If each developer takes a turn as the team's spokesperson for the week, will the external partner eventually figure out that "John will do anything for us and Steve is mean"?
I also wonder whether the OP should have made a clear decision (and set expectations) between bootstrapping and a startup, because I think they require a different mentality, operating style, and even market. With a startup, he and his cofounders could focus on applying to incubators and/or angel investors early on. They would get some feedback on their ideas. They could even get a little funding to hire a designer or contractor to help build the UI (which appears to be the big drag). In this case you are trading ownership for additional support. If you bootstrap, then you need to prepare for a longer timeframe and the financial burden. Maybe keep the original full-time job and work part-time on the side project.
I don't know what is going on at Zoom, but I suspect at least part of it is sneaky. For example, about 3 or 4 weeks ago I heard about this company and learned that it has R&D in China, per its SEC filing at IPO. However, when I checked its website, the career section led me to https://jobs.lever.co/zoom, and there was ONLY one opening in China listed there (I remember it was a position in the marketing department). Then I searched for the company in Chinese media and saw that they were hiring all types of engineers. That made me uncomfortable about buying its stock. Interestingly, if you look at the same career website now, China has been removed from the city dropdown. Maybe they are cutting off or "decoupling" the Chinese R&D?
"Zoom, a Silicon Valley-based company, appears to own three companies in China through which at least 700 employees are paid to develop Zoom’s software. This arrangement is ostensibly an effort at labor arbitrage: Zoom can avoid paying US wages while selling to US customers, thus increasing their profit margin. However, this arrangement may make Zoom responsive to pressure from Chinese authorities."
This explains how their web installer which is essentially glorified malware was made and not reported on some bay area dev's blog or leaked to the press immediately. I was always surprised that a US software team could make something like that in a consumer project in a prominent company and it not be talked about for so long.
It's basically what a lot of consumer goods companies are now. All the products are made and mostly designed in China, and then the US HQ does all the sales, marketing, and funneling product requirements back to China.
Because the CCP has no qualms about threatening an employee’s family to insert a backdoor or exfiltrate information, for one.
The decoupling has begun. Sentiment in the US toward China has never been as negative as it is now, from both sides of the aisle. I wouldn’t be surprised if we even see sanctions against China after the dust settles on this COVID fiasco.
The US is no white knight. But it is not doing anything close to what the CCP does on a regular basis. When was the last time a Trump protestor or Obama protestor disappeared and was never heard from again? Or disappeared and showed up again months later, 30 lbs lighter and apologetic about how wrong they were about the government?
"Starting a war to deflect from problems at home", what the heck would you call the invasion of Tibet then?
And while not literally a war, what purpose do you think the bellicose suppression of the fact of Taiwanese independence both domestically and abroad serves?
Some people seem so obsessed with the country they live in, they can't see the rest of the world properly. An over focus on domestic politics distorts everything with parochialism.
Not exactly. Tibet is a different type of thing: it's more like China's expansion than Xi wanting to deflect attention from his own problems. American wars weren't for any purpose or even any benefit to the US, and lots of money was wasted and stolen through the military-industrial complex.
The US does not start wars to distract from problems at home. That’s conspiracy horse manure. Whether you want to believe it or not, every president that has ever started a conflict or US involvement in an existing conflict has felt the action justified on foreign policy reasons.
Those reasons might be something you object to, or even downright stupid in hindsight. But only in Hollywood is it ever a smoke screen for domestic issues.
The USA, Russia, Turkey, and China have done exactly that throughout history. It is also an indicator of a failed or non-functioning democracy, which is what surprises me about the USA (it IS a functioning democracy). I understand that Russia and Turkey (I have been to both countries) are democracy-challenged, and instead of solving their internal problems they create new external ones to divert attention and seek "greatness" (one of the things that Trump* also proclaims).
Humanity needs to be great together.
Together we stand, divided we fall (said the poet).
*I don't vote in the USA, so I don't care who they/you elect. If it were Clinton or Bush (Sr/Jr) or Obama saying "screw the world, we should care only about ourselves", I would be equally judgemental. You should hear me discuss politics with Russians (they can't see why their dictator is bad for them). And now I will get downvoted by both Americans AND Russians :)
Enforcing US IP protections isn't a partisan issue. I think it's crazy that people are trusting zoom for their critical communications (like design review meetings and screen sharing schematics/process diagrams!) when there's a non-zero chance that the CCP/PLA has the infrastructure to:
inb4 the whataboutism: yes, the US Government has participated (and might continue to participate) in state-sponsored industrial espionage. But if I'm an American company, I'm not going to care about that.
I've worked for at least one company that outright refused to do business in China or with certain companies that had oversized presence in mainland China because of experiences with this kind of problem. I know of some engineers that were arrested upon entry to the USA because they stole company IP and founded a company in China that used it. I know of another company that had network hardware compromised by an employee over there and was used to attempt to penetrate US networks (and if you wanna get spooked, they weren't alerted by their stateside infosec team, but federal authorities). I don't know why people treat me like a conspiracy theorist for bringing this up about Zoom routing data through China and using less-than-best-practice security.
But if Microsoft had teams in China or Russia working on MS Teams, I would be very concerned. The same goes for Slack, which many companies now rely on to keep business going.
I'm very sure that Microsoft (and plenty of other companies including Apple) has teams in China, Russia and other countries to develop and update proper localizations for those apps.
If you have eng in the US, having a decent chunk of it in China is much less threatening. For example, your internal controls can specify code review by American employees. Your key servers can remain in America or the EU, with stronger privacy protection regimes (not necessarily strong; just stronger than China's).
This isn't perfect, but it makes subversion (1) more difficult, (2) probably more targeted (see eg Saudi Arabia using Saudi nationals employed by Twitter to steal identities of critics on Twitter), (3) more likely to be discovered.
My company's security model doesn't include the Chinese government / national security / military, but it could include the Chinese government giving our sales leads (which are evident if you can see our Zoom calls) to a domestic competitor. Broad exfiltration of data like that is much much harder if the engineering core is in the US or EU.
What alerted me at the time was the discrepancy between the Chinese openings on their U.S. HR site (only one) and their job postings on Chinese job sites.
I doubt it is a complicated conspiracy. Someone probably pushed the wrong button on the HR site.
I’d guess the one opening you saw was coming out of a US manager’s budget, and the manager wanted some physical presence in China to help work with teams that are based there.
It’s not surprising that they wouldn’t target China-based positions in fluent Chinese language offices at their US based English language site.
Also, I’ve been using Zoom at work for years. They’re more popular with younger firms (“anything but Cisco”, maybe?).
I think OP was alluding to Zoom not showcasing the reqs on the US careers site. Most companies including Microsoft do display open positions in all countries including China on their US page.
I very much doubt Microsoft does any serious development of their core products in China. Localisation, some local support, and other less sensitive stuff, yes.
Anti-China sentiment is being stoked heavily in the US right now. Anything China-related is widely seen as evil and/or untrustworthy. Usually these opinions are expressed on China-sourced hardware.
The R&D centers in China (I think there are 3 of them) are their competitive advantage: they help Zoom manage costs and be profitable from early on. I also think the engineering teams there can exchange knowledge with their peers at other local Chinese tech firms. For example, China was literally put in lockdown in February, and all of the businesses and schools went online in a couple of days. That was an enormous achievement for anyone who participated in scaling infrastructure at the big tech companies in China, and I would not be surprised if the Zoom engineers in the Chinese R&D centers learned a couple of lessons from them. Zoom even has a feature that lets you "soften" your appearance to look "better" in a video conference, which has been a feature of camera/image software popular in the Asian market. So I suspect their product team may also have some connections there.
However, Zoom's biggest advantage is also its biggest risk, if they want to be part of the communication infrastructure of any business/government/university/etc in western countries, especially after so many security incidents that happened in the past years.
Zoom is fully aware that associating themselves with China, considering the latest developments, would be catastrophic for them, whether it's job listings or some of its servers being in China.
What's wrong with having R&D in China? The talent market is attractive, and given that Zoom's founder is Chinese, it seems in some ways more reasonable than opening another office in the US.
Lots of big tech companies have a strong R&D presence in China, like MSFT and GOOGL.
Btw, I might be wrong but CCP banned Zoom about half a year ago.
Is it the same company? We have "employees" around the world, but that ranges from being employed by the mother ship, to being employed by one of the various subsidiaries (all with very similar-sounding names), to not being officially employed at all but working via a local agent (for liability and regulatory reasons and, in certain regimes, because it is mandated).
1, Back-end services with clear boundaries that decouple concerns based on the dev teams' domain responsibilities, with few dependencies on each other and a respected source of record. This is very much what "micro-services" are for.
2, Middle-tier services that consolidate or aggregate back-end APIs to serve the front-ends (especially the mobile apps) and take care of the business logic. Back-end guys all love micro-services, but someone must put them all together... GraphQL so far seems to fit this bill.
3, Analytics and reporting. This is a totally different animal from product development and has almost opposite requirements. This is where whatever ETL, data lake, or data pipeline you have is used, along with your preferred BI or analytics tooling.
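As a rough illustration of points 1 and 2, here is a small Python sketch. Everything in it is hypothetical: the two "services" are stubbed as in-memory lookups rather than real network calls, and the names (`user_service`, `order_service`, `profile_view`) are invented for the example. The point is only that each back-end owns its own data, and the middle tier stitches the results into the shape a front-end wants.

```python
# Tier 1: two independent back-end services, each the source of record
# for its own domain. Stubbed here as in-memory dictionaries.
USERS = {1: {"id": 1, "name": "Ada"}}
ORDERS = {1: [{"order_id": 10, "total": 25.0}, {"order_id": 11, "total": 5.5}]}

def user_service(user_id):
    """Back-end service owning user records; returns None for unknown users."""
    return USERS.get(user_id)

def order_service(user_id):
    """Back-end service owning order records; returns an empty list if none."""
    return ORDERS.get(user_id, [])

# Tier 2: middle-tier aggregator. It calls both back-ends and applies the
# business logic (counting and summing) so the front-end makes one request.
def profile_view(user_id):
    user = user_service(user_id)
    if user is None:
        return None
    orders = order_service(user_id)
    return {
        "name": user["name"],
        "order_count": len(orders),
        "total_spent": sum(o["total"] for o in orders),
    }

if __name__ == "__main__":
    print(profile_view(1))
```

A GraphQL layer would play the role of `profile_view` here, with resolvers calling the individual back-ends; the analytics tier (point 3) would read from the same sources of record through a separate pipeline rather than through this request path.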
I may be wrong, but I have a feeling that technology companies like Apple and Google are developing on-device software to make user data so protected that they can say "it is technically impossible for me to crack my own software" even when ordered to by a judge (presumably for a legitimate reason), and that the DOJ is using this case to try to prevent that from happening. If this is the case, then personally I am on the DOJ's side, because I recognize that this is a less-than-ideal world (actually I think it is even worse), and this country is technically at war.
+1
I think the problem is not homework; it is that there is no other reliable and affordable "better" alternative to homework or standard test. We can all argue that homework takes away time kids could otherwise spend on exercising, socializing, innovating, etc., but truth is they are more likely just sitting there and playing their phones.
> but truth is they are more likely just sitting there and playing their phones
I might be in a minority here, but I have an issue with that sentiment. Because before phones, it was "playing video games". Before that, it was "watching TV". Before that, it was probably "wasting time outside" or "chit-chatting with friends". There's always something else that's "wasting time", even if that "wasting" is infinitely more useful than the activity deemed as "not wasteful".
I'm from the "watching TV" to "playing video games" transition generation and I must say that I owe more to both of them than to most time spent in school.
> there is no other reliable and affordable "better" alternative to homework or standard test
I agree with that though. Standardized tests and homework seem to me to be an artifact of an increasingly complex society. Some of that complexity may go away, though, if we advance far enough to get rid of the job market entirely (UBI, automation, etc.); it's competition that requires standardized grades so that people can be compared with each other.
> I'm from the "watching TV" to "playing video games" transition generation and I must say that I owe more to both of them than to most time spent in school.
Me too. I spent a lot of time watching educational TV programming when I was a kid. I remember the day I made the sad realization that I had seen every episode of Mr. Wizard's world and there would be nothing new to learn from the show.
I think that there is too much emphasis on standardized testing, but I do think that some testing is necessary. I think it's important to have some way to measure whether teachers are imparting any knowledge to their students or engendering a love of learning.
My favorite teachers (with some of whom I remain in contact) are the ones who inspired me to learn more about the subject matter than was taught in the class.